Sir:

1. I, Cameron Jennings, PhD, declare and say I am a resident of 1 Janice Street, Warners
Bay, New South Wales 2282, Australia. 

2. I am an employee of Energy and Management Services Pty Ltd. My previous scientific 
employment consists of Immune System Therapeutics Ltd (Sydney), Harvard University 
(Boston) and La Trobe University (Melbourne). Details of my career as well as 
publications may be found in my curriculum vitae (Exhibit A). 

3. I understand that US Patent Application No. 10/590,690 (referred to hereafter as the
"Patent Application") is assigned to Immune System Therapeutics Limited. I presently 
own options which would allow me to purchase shares in Immune System Therapeutics 
Limited should the company become publicly listed. 

4. I have been asked by FB Rice, Patent Attorneys for Immune System Therapeutics 
Limited, to provide an independent opinion on the state of knowledge surrounding kappa 
and lambda light chains and on the invention described in the Patent Application. I have 
been asked in particular to comment on the obviousness rejections set out in the Office 
Action dated 11 September 2009 issued in connection with the Patent Application.

5. I have read the Patent Application and have reviewed the claims that I understand are 
presently being considered by the United States Patent and Trademark Office. I 
understand that the claimed invention (referred to hereafter as the "Invention") relates to 
methods of treating B-cell disorders by administering an antibody or ligand that 
specifically binds to lambda myeloma antigen (LMA). The importance of this invention is 
that it provides, for the first time, LMA as a target that is selective for tumor B-cells 
expressing LMA. Antibodies and ligands that bind LMA do not target normal B-cells
which express lambda light chain in the context of intact immunoglobulin. 



6. I have considerable experience in the technical field of the claimed invention. I have over 
3 years of experience in research involving immunoglobulin light chains and membrane- 
bound proteins. Prior to this time, my research consisted of a senior post-doctoral 
fellowship at Harvard University researching membrane bound malaria vaccine candidates 
and a PhD that was co-supervised by Professor Marilyn Anderson (Biochemistry 
Department at La Trobe University) and Professor David Craik (Institute of Molecular 
Bioscience at The University of Queensland). 

7. I have reviewed the Office Action dated 11 September 2009 in relation to the
above-referenced Patent Application. I understand that the Patent Office has taken the position
that the Invention would be obvious to a person skilled in the art in light of Uhr et al. (US 
7,792,447), Raison et al. (WO 03/004056) and Abe et al. (1993).

8. Uhr et al. are cited as disclosing antibodies that bind intact immunoglobulin associated 
lambda light chain on tumor cells. Raison et al. are cited as disclosing that malignant 
cells express both kappa and lambda light chains, that free kappa light chain is expressed 
on the cell surface of kappa light chain expressing myeloma cells and that antibodies 
which bind free kappa light chain can be used to treat tumors. Abe et al. are cited as 
disclosing antibodies that bind free light chain and not intact immunoglobulin. 

9. It is my understanding that the Patent Application and the cited publications are to be 
viewed from the perspective of one of ordinary skill in the art in the relevant field (a 
"Skilled Person") at the time of filing of the Patent Application in question. I have been 
asked to consider this time to be the period around or before 27 February 2004 ("the 
Relevant Period"). I would expect a Skilled Person in the field of antibody therapy during 
the Relevant Period to have been represented by a scientist with a PhD degree in 
Biochemistry and/or at least 3 to 5 years experience in the field of Biochemistry, or an 
educational background at the same degree level in a related field and equivalent level of 
experience. 

10. I am very familiar with the technical field of the claimed invention. I am qualified to 
analyze literature in this field and to provide my opinion as to what literature in this field 
discloses or suggests to the Skilled Person at the Relevant Period. 
11. By the Relevant Period I had attained at least the level of such a Skilled Person, and
further in view of my qualifications discussed above, I believe that I am qualified by 
training and experience to address what a Skilled Person would have understood from 
reading the Patent Application and the cited publications. 

12. The Examiner's objection appears to be based on the premise that kappa and lambda light 
chains are the two known alleles of immunoglobulin light chain and therefore once free 
kappa light chain had been found expressed on the surface of myeloma cells, it would 
have been expected that free lambda light chains would also be expressed on the surface 
of malignant B cells. 

13. In my view, it is not correct to assume that the expression and localisation of the kappa 
light chain in myeloma cells is in any way predictive of the expression and localisation of 
lambda light chains in malignant B cells. 

14. Immunoglobulins are composed of two chains, light and heavy, which form a functional 
heterodimer. The light chain molecules are of two protein families, "kappa" and
"lambda". The kappa and lambda gene loci are located on chromosomes 2 and 22
respectively and differ in their number of variable (V), joining (J) and constant (C) genes
as well as their general arrangement. For example, there is a single C kappa gene (Zimmer
et al., 1990) while there are at least 4 functional C lambda genes (Dariavach et al., 1987).
While both kappa and lambda light chains maintain a conserved role when present in 
intact immunoglobulin, no normal biological function has been attributed to light chains 
alone. 

15. Although kappa and lambda light chains both comprise a conserved (C) and variable (V) 
domain (within the respective families) and they both complete a multimeric 
immunoglobulin, the proteins share minimal sequence identity within their C domains 
(Kabat et al., 1975). Furthermore, the variable domains of the kappa and lambda light 
chains are derived from genetic recombination events and the recombined genes also 
undergo the process of somatic hypermutation to give rise to antigen binding moieties (in
the context of the whole immunoglobulin). As such the V regions vary in their sequence 
as well. 

16. Kappa and lambda V domains have a typical immunoglobulin fold, which is also referred
to as the immunoglobulin beta sandwich, consisting of two antiparallel β-barrel sheets.
Although both kappa and lambda V domains comprise this typical immunoglobulin fold,
analysis of the crystal structure of light chains demonstrated that the C domains of lambda
light chain and gamma heavy chain were structurally amenable to the formation of
cross-molecule beta structures, but kappa light chains were not (Edmundson and Borrebaeck,
1998). Thus, it was known that there were differences in the structural interactions
between lambda light chains and immunoglobulin heavy chain when compared to the
interactions of kappa light chain and heavy chain.

17. The immunoglobulin beta sandwich, referred to above, contains seven beta strands which 
form the sandwich of two beta sheets. This structural motif is found on a vast array of 
protein sub-families with diverse biological activities and sub-cellular locations (for a
complete set of proteins in the SCOP database, see Murzin et al., 1995; available from
http://scop.mrc-lmb.cam.ac.uk/scop/data/scop.b.c.b.html). Proteins that fall within this
structural family include myelin oligodendrocyte glycoprotein (MOG), the T-cell antigen
receptor and TREM-1 (triggering receptor expressed on myeloid cells), to name a few.

18. As further evidence of differences in structure of kappa and lambda light chain, the 
Peptostreptococcus magnus protein L binds kappa light chains but does not bind lambda 
light chains (Graille et al., 2001). This again suggests variation in sequence and local
structure of the kappa and lambda light chains.

19. The differences in the structures of the kappa and lambda light chains are further
highlighted by observations indicating that lambda antibodies are found in two-thirds of
light chain amyloidosis cases, whereas kappa light chains mediate greater than 85% of
Light-Chain Deposition Disease (LCDD). Thus, the light chain composition appears to
confer different pathologies to kappa or lambda light chains (Solomon and Weiss, 1995).

20. The difference in the structure of amyloid fibrils (fibrillar) and the LCDD deposits
(amorphous) also reflects the difference in the general structure of kappa and lambda light 
chains (Khurana et al., 2001). Moreover, lambda light chains exist predominantly as 
dimers while kappa light chains are mostly present as monomers (Solomon and Weiss, 
1995). As a consequence, the character and rate of the catabolic processes that are 
involved in the clearance of kappa and lambda light chains are different. It has been 
suggested that this phenomenon contributes to the predominance of lambda light chains in 
amyloidosis (Solomon and Weiss, 1995). 
21. There exist three broad categories of surface expressed proteins: i) integrally associated
proteins possessing hydrophobic surfaces that readily interact with the acyl core of the
bilayer; ii) membrane proteins that are covalently attached to certain phospholipids; and
iii) peripheral proteins that associate with the membrane through charge-charge
electrostatic interactions (Lodish et al. (2000) Molecular and Cellular Biology, 4th Ed.,
Section 3.4, WH Freeman, New York). Examination of the sequence of kappa and lambda
light chains does not reveal candidate residues suitable for covalent attachment to the 
membrane. Thus, it can be surmised that the membrane interaction of either kappa or 
lambda light chains would occur via hydrophobic and/or electrostatic interactions. 

22. Differences in the primary sequence of kappa and lambda light chains affect not only the
charge of the exposed side chains of the proteins but also their predominant presence as
either monomers or dimers respectively. Thus, in view of the differences in primary
sequence of kappa and lambda light chains, which in turn affect the presence/absence of
exposed hydrophobic surfaces, and the lack of a membrane targeting sequence in these 
proteins, the expression of either of these two proteins on the cell surface could not be 
predicted. Thus, it could not have been predicted by analysis of the structure or sequence 
of kappa or lambda light chains that either of these proteins would associate with the 
membrane of malignant B cells. 

23. In my opinion, therefore, there is nothing in the cited prior art to suggest that a Skilled 
Person could have predicted that free lambda light chain would be associated with the 
membrane of tumor B-cells. There was no suggestion or motivation, therefore, around
February 2004 for a Skilled Person to investigate free lambda light chains as a potential 
therapeutic target on tumor B-cells. 

24. I hereby declare that all statements made herein of my own knowledge are true and that 
all statements made on information and belief are believed to be true; and further that 
these statements were made with the knowledge that willful false statements and the like 
so made are punishable by fine or imprisonment, or both, under § 1001 of Title XVIII of 
the United States Code, and that such willful false statements may jeopardize the validity 
of the Patent Application or any patent issuing thereon.





Date 



Name: Cameron Jennings, PhD 
EXHIBIT A
Dr Cameron Victor Jennings

1 Janice Street, Warners Bay NSW 2282 
Australia 
+61430095522 
camJennings@hotmail.com 



EDUCATION: 

1. Post-Graduate Diploma in Energy Studies (Awarded in 2010) 

Murdoch University, School of Engineering and Energy 
Perth, Australia 

2. Doctor of Philosophy (awarded in 2003) 

La Trobe University, Department of Biochemistry 
Melbourne, Australia

3. Bachelor of Science (First Class Honours, awarded in 1999)

La Trobe University
Melbourne, Australia



WORK EXPERIENCE: 

1. Senior Scientist/Project Leader (July 2007 - Present) 
Immune System Therapeutics LTD. Sydney Australia. 

Role: Supervise and drive projects that increase the intellectual property profile of 
Immune System Therapeutics Ltd. 

2. Scientist (2006-July 2007) 

Immune System Therapeutics LTD. Sydney Australia. 

Role: Drive projects that increase the intellectual property profile of the company 

3. Senior Post-Doctoral Research Fellow (2003-2006) 
Harvard University, School of Public Health, 

Department of Immunology and Infectious Diseases, Boston USA. 

Role: Perform and supervise research into Malaria vaccine candidates 

4. Laboratory class teaching assistant - Undergraduate Biochemistry (1999-2001) 
La Trobe University, Department of Biochemistry. Melbourne, Australia. 

Role: Supervise undergraduate students in practical classes 



5. Research Assistant, Phillips Laboratory (1997-1999). 

La Trobe University, Department of Biochemistry. Melbourne, Australia. 

Role: Support research into potential anti-cancer compounds 
OTHER ACADEMIC TRAINING:

1. Environmental Auditor Certification Workshop (July, 2009) 
Thomson - Reuters 

Aim: Training and certification for auditing environmental management systems 

2. Environmental Management Systems Workshop (July, 2009) 
Thomson - Reuters 

Aim: Additional training in environmental management systems and auditing 

3. Integrated Management Systems Workshop (July, 2009) 
Thomson - Reuters 

Aim: Training in integrating environmental management systems into existing business 
structures 

4. NSW Enterprise Workshop (Winter Program 2008) 

Aim: To develop and defend a business plan, including an executive summary, 
marketing, operations, project management and a financial plan 

ADDITIONAL ACHIEVEMENTS: 

• Won a business plan award at the NSW Enterprise Workshop (Best Intellectual 
Property Plan, shared award with 2 other team members) 2008 

• Supervised an honours student who received the Deans Merit Award at the
University of Technology, Sydney 2008

• Jennings et al., 2005 was among the 20 most cited articles (position 11) of 2005,
January - June. 

• Awarded a monetary prize for the best oral seminar at the inaugural Melbourne 
Protein Group Symposium. 2002. 

• Received a student bursary to attend the Fourth Australian Peptide Conference
(Lindeman Island, Queensland, Australia). 2001. 

• President of the La Trobe University Biochemistry Society. 2000 

• Accepted into the Golden Key Club for Scholastic achievements. 1997. 

• Accepted onto the Deans List for Scholastic achievements (La Trobe University,
Melbourne, Australia). 1997.

PATENTS: 

Novel nucleic acid molecules 

International application WO 0134829 filed 25th November 2000. 
Inventors: Craik, D.J., Anderson, M.A. and Jennings, C.V. 

Several Provisional patent applications written and filed 
INVITED SEMINARS/ORAL PRESENTATIONS:
• Institute for the Biotechnology of Infectious Diseases (University of
Technology Sydney) 2007

Jennings C.V., Ahouidi A.D., Bei A., Sarr O., Ndir O., Wirth D., Mboup S., 
Duraisingh M.T 

• Biological Sciences and Public Health Retreat (Harvard University) 2005 
Jennings C.V., Ahouidi A.D., Bei A., Sarr O., Ndir O., Wirth D., Mboup S., 
Duraisingh M.T. 

• Biological Sciences and Public Health Retreat (Harvard University) 2003 

Jennings, C.V., Whitehurst, N., Desimone, A., Duraisingh, M.T. 

• PhD Qualification Seminar 2002 

Jennings C.V. Craik D.J., Anderson M.A 

• Inaugural Melbourne Protein Group Symposium 2002 

Jennings C.V., West J.A., Craik D.J., Anderson M.A. 

• PhD Conversion Seminar 2001 

Jennings C.V. Craik D.J., Anderson M.A 



RESEARCH PUBLICATIONS: 

Bushkin GG, Ratner DM, Cui J, Banerjee S, Duraisingh MT, Jennings CV, Dvorin JD, 
Gubbels MJ, Robertson SD, Steffen M, O'Keefe BR, Robbins PW, Samuelson J.
Eukaryot Cell. 2009 Oct 16. 

Desimone TM*, Jennings CV*, Bei AK, Comeaux C, Coleman BI, Refour P, Triglia T, 
Stubbs J, Cowman AF, Duraisingh MT. 
Mol Microbiol. 2009 EPub. Apr 14 
* Joint first authors 

Lantos PM, Ahouidi AD, Bei AK, Jennings CV, Sarr O, Ndir O, Wirth DF, Mboup S, 
Duraisingh MT. 

Parasitology. 2009 Jan;136(1):1-9.

Desimone TM, Bei AK, Jennings CV, Duraisingh MT. 
Int J Parasitol. 2008 Mar;39(4):399-405.

Duraisingh MT, DeSimone T, Jennings C, Refour P, Wu C 
Subcell Biochem 2008. :46-57. 
Gillon AD, Saska I, Jennings CV, Guarino RF, Craik DJ, Anderson MA.
Plant J. 2008 Feb;53(3):505-15. 

Jennings CV, Ahouidi AD, Zilversmit M, Bei A, Rayner J, Sarr O, Ndir O, Wirth DF,
Mboup S, Duraisingh MT. 

Infection and Immunity. 2007 Jul;75(7):3531-3538. 

Jennings C.V., Rosengren K.J., Daly N.L., Plan M., Stevens J., Scanlon M.J., Waine C, 
Norman D.G., Anderson M.A., Craik D.J. 
Biochemistry. 2005 Jan 25;44(3):851-860. 

Dutton J.L., Renda R.F., Waine C, Clark R.J., Daly N.L., Jennings C.V., Anderson
M.A., Craik D.J. 

J Biol Chem. 2004 Nov 5;279(45):46858-46867. 

Jennings C.V., West J., Waine C, Craik D., Anderson M. 
Proc. Natl. Acad. Sci. USA. 2001 Sep 11;98(19):10614-10619. 

Craik, D.J., Anderson, M.A., Barry, D.G., Clark, R.J., Daly, N.L., Jennings, C.V.,
Mulvenna, J.
International Journal of Peptide Research and Therapeutics, 2001 May;8(3-5):119-128.
Transposition of human immunoglobulin Vκ genes within
the same chromosome and the mechanism of their
amplification



F.-J.Zimmer, H.Hameister¹, H.Schek and
H.G.Zachau

Institut für Physiologische Chemie, Physikalische Biochemie und
Zellbiologie der Universität München, 8000 München 2 and ¹Abteilung
Klinische Genetik der Universität Ulm, 7900 Ulm, FRG
Communicated by H.G.Zachau

The variable, joining and constant gene segments of the
human immunoglobulin κ locus (Vκ, Jκ and Cκ) are
located on the short arm of chromosome 2 at 2p11-2p12.
Here we describe a cluster of 11 Vκ genes on the long
arm of chromosome 2 at 2cen-q11. By pulsed-field gel
electrophoresis, cosmid cloning and DNA sequencing the
cluster was shown to consist of four amplified units
(amplicons). The amplicons, each 110-160 kb in size,
are organized within 650 kb as an array of inverted
repeats with short stretches of non-amplified DNA in
between. Cloning and sequencing of three different joints
between amplified and non-amplified DNA revealed the
existence of parts of Alu repeats at each of the analysed
joints. It is suggested that during evolution a group of
five Vκ genes was transposed from the short to the long
arm of chromosome 2 by a pericentric inversion. Three
of the five Vκ genes were then amplified in two
subsequent steps to yield the structure found in the
majority of the present day population. The possible
relation of this structure to a pericentric inversion of
chromosome 2 that is seen cytogenetically in a small
fraction of today's population is discussed.
Key words: Alu repeats/amplification/immunoglobulin Vκ
genes/transposition



Introduction 

Amplification and transposition of genes play an important
role in the formation of multigene families during evolution
(for review, see Maeda and Smithies, 1986). In the case of
the human gene family coding for the variable regions of
the immunoglobulin light chains of the κ type (Vκ),
putative transpositions of Vκ genes led to the formation of
a 'mixed' gene cluster, in which genes of different subgroups
are interdigitated (Pech and Zachau, 1984). A subsequent
duplication led to the generation of a second copy of a large
part of the Vκ gene cluster (for reviews, see Zachau, 1989,
1990). The κ locus contains ~70 Vκ genes (for reviews,
see Zachau, 1989, 1990) and is located on the short arm
of chromosome 2, at 2p12 (Malcolm et al., 1982). In
addition, Vκ genes have been found on the chromosomes
1, 22 (Lötscher et al., 1986, 1988) and other chromosomes
(Straubinger et al., 1988). These genes are called orphons
in analogy to histone and ribosomal RNA genes found outside
of their respective gene clusters (Childs et al., 1981).
We have previously reported the structure of two contiguous
cloned regions (contigs), called Wa and Wb, with
a total of nine Vκ genes (Pohlenz et al., 1987). The two W
contigs have been assigned to chromosome 2 (Lötscher
et al., 1988) and were thought to be part of the κ locus.
However, we did not succeed in linking them to the cloned
parts of the κ locus by chromosomal walking or by pulsed-field
gel electrophoresis (PFG). In the course of the PFG
experiments, a third W contig was detected. The three
contigs are present in the genomes of all individuals so far
analysed. The observation that the W contigs are located on
chromosome 2, yet are not part of the κ locus, prompted
us to analyse their genomic organization and chromosomal
location in more detail.

Results 

Characterization of the W contigs 
We previously described the characterization of the two
contigs Wa and Wb (Pohlenz et al., 1987). Here, we report
the isolation and characterization of two sets of cosmid
clones, one extending the contig Wb and one representing
a new W contig, termed Wc.

In the course of chromosomal walking experiments, the
cosmid libraries III (Pohlenz et al., 1987) and V (Lorenz,
1989) were screened with the W-specific clone m654-1
(Pohlenz et al., 1987); 25 cosmid clones were isolated. Five
of them were derived from Wa without extending the known
contig. The other 20 clones belonged to Wb with some of
them extending the contig. Two additional Vκ genes were
found in the extending cosmids. A map of the W contigs
with some representative cosmid clones is shown in Figure 1.
A description of all clones can be found in Zimmer (1989).

The existence of a third contig, Wc, was demonstrated
on PFG blots that had been hybridized with the Wa-derived
probe m167-1 (Pohlenz et al., 1987; see also Figure 1). In
subsequent restriction mapping of cosmid clones that had
been isolated previously with m167-1, but not fully
characterized (Pohlenz, 1986), three clones matching neither
Wa nor Wb were identified. Wc was shown to be a third
independent contig and not an allelic variant of Wa or Wb
by demonstrating the existence of Wa-, Wb- and Wc-characteristic
fragments in the genomes of all 20 unrelated
individuals tested (Zimmer, 1989; data not shown here). The
map of Wc is included in Figure 1.

We use the term amplified unit or amplicon for those parts
of the contigs Wa, Wb and Wc that are homologous to each
other. As Wb contains two amplified units, a minimum of
four W amplicons (named I-IV; Figure 1) exist within the
human genome.

How many Vκ gene-containing W amplicons exist
in the human genome?

To test whether more than three Vκ gene-containing
amplicons exist, we estimated their copy number relative
to Cκ, which is known to be single copy, by quantitation
Fig. 1. Restriction maps of the genomic contigs Wa, Wb and Wc. The maps were derived from cosmid clones described in Pohlenz et al. (1987)
and Zimmer (1989). For each contig only some representative cosmid clones are shown. The maps are aligned to demonstrate the homology between
the contigs. Positions of restriction sites that are identical in two contigs or occur in contigs without counterpart are marked by vertical bars.
Differences in restriction sites between two contigs are symbolized by arrows pointing to the respective map positions. In the map of Wc, restriction
sites are given separately for the part that shows no homology to Wb or Wa. The scale (kb) applies to all contigs. A deletion of 1 kb in Wb at map
position 92 kb is marked by a triangle. Vκ genes are drawn as filled boxes; the subgroup designations are indicated (I-III). Plasmid and M13
subclones are shown as horizontal bars above the restriction map of Wa and underneath those of Wb and Wc. The subclones m654-1, m165-5 and
m167-1 are described in Pohlenz et al. (1987), the remaining ones in Zimmer (1989). Amplified units are marked by fat lines and are numbered
(I-IV). Clones derived from one amplicon hybridize to the corresponding position within other amplicons; these positions are marked by dotted
lines. Arrows at the maps of Wa and Wb indicate the transcriptional orientation of the Vκ genes. A dot marks a NruI site at map position 72 kb,
which is present only in some alleles of Wa.



of Southern blot hybridizations. We chose a clone which
hybridizes to a position close to one of the amplified Vκ
genes (m654-1) and linked it to a fragment derived from
the Cκ region (I-1). The construct m654-1/I-1 is shown in
Figure 2a and the blot hybridizations are in Figure 2b. The
copy number of the W contigs is estimated as the multiple
of the Cκ signal (Table I). From the calculated W/Cκ ratios
it is very likely that no further amplicons hybridizing with
m654-1 exist. The procedure of copy number determination
is similar to the one of Meindl (1990) who estimated the
overall number of Vκ genes including W-type Vκ genes in
the DNA of individuum AF, and arrived at data compatible
with three copies of the W amplicon and one copy of the
non-amplified gene W6. The cosmid clones constituting the
W contigs of Figure 1 are derived from the DNA of
individuum N and individuum St. Since the blot hybridizations
of restriction nuclease digests of the DNAs AF, N,
St and PC-3 (see below) show no significant differences in
the regions of the W-type Vκ genes (Meindl, 1990) it is
likely that the number of Vκ gene-containing W amplicons
is three. The amplified unit within Wc is not detected by
hybridization with m654-1/I-1 since it does not reach into
the Vκ gene-containing regions of the W amplicons
(Figure 1). To identify other possibly existing truncated
amplicons and to elucidate the genomic organization of the
known amplicons, we established a long-range map of all
regions hybridizing with W-specific probes by PFG.
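
The copy-number estimate described above (the W signal expressed as a multiple of the single-copy Cκ signal) reduces to a simple ratio. The following is a minimal sketch in Python, assuming that blank correction means subtracting one background value from each band's c.p.m.; the function name and the numerical values are hypothetical illustrations and are not data from Table I.

```python
def w_copy_ratio(w_cpm, c_kappa_cpm, blank_cpm=0.0):
    """Return the W/C-kappa signal ratio after blank correction.

    C-kappa is taken as single copy (1.0), so the ratio approximates
    the number of W copies contributing to the band.
    All inputs are counts per minute (c.p.m.).
    """
    w_corrected = w_cpm - blank_cpm
    c_corrected = c_kappa_cpm - blank_cpm
    if c_corrected <= 0:
        raise ValueError("C-kappa signal must exceed the blank value")
    return w_corrected / c_corrected


# Hypothetical counts, chosen only to illustrate the arithmetic.
ratio = w_copy_ratio(w_cpm=650.0, c_kappa_cpm=250.0, blank_cpm=50.0)
print(round(ratio, 2), "->", round(ratio), "copies")
```

Rounding the ratio to the nearest integer gives the estimated number of W copies for that band; in practice the standard deviation of the measurements would determine how much weight such a rounded value can bear.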



Long-range map of the W regions 

The analysis of the organization of the W region was carried
out with DNA from the prostate carcinoma cell line PC-3
(Kaighn et al., 1979), as PC-3 DNA seems to contain more
unmethylated restriction sites than DNA from other sources
(Lorenz et al., 1987), allowing the use of methylation-sensitive
restriction nucleases for long-range mapping.
A remarkable feature of the PFG blots is that all W region
probes tested hybridize only to two NotI fragments, 250 and
600 kb in size. Vκ gene probes hybridize mainly with two
additional NotI fragments which contain the κ locus (Lorenz
et al., 1987). The identification of NotI sites in cosmid clones
of Wa and Wc, which are also present in PC-3 DNA (NotI
sites at map positions 550 and 1150 kb of Figure 3), has
been useful in the construction of the long-range map.

The PFG studies did not reveal any evidence for the 
existence of additional W regions. According to the PFG 
experiments, the four W amplicons are organized as an array 
of large inverted repeats. 



Human immunoglobulin Vκ genes




Fig. 2. Determination of the number of W contigs. In (a) the construct
m654-1/I-1, which was used for the hybridizations, is shown. A
1.1 kb EcoRI-SacI fragment from pI-1, which is unique for the Cκ
region (Klobeck et al., 1984), has been cloned into the W-specific
probe m654-1. Southern blot analyses with m654-1/I-1 are shown in
(b). Each lane contains 10 µg of placenta DNA digested with the
indicated restriction nucleases. The W and Cκ derived fragments are
assigned to the respective bands, the sizes of which are given in kb.
For the quantitative evaluation see Table I.



Cloning and sequence analyses of joints between 
amplified and non-amplified DNA 
Four joints between amplified and non-amplified DNA can 
be identified within the W regions by comparing the 
respective restriction maps (transitions between bars and lines 
in Figure 1). By cloning joint-containing fragments into
plasmid (p165-3, p654-3, p654-4) or M13 vectors (m177-1,
m168-1) and comparing the fine maps, fragments suitable
for the sequence analyses of the joints were identified and
subcloned.

The break-off in homology between Wa and Wb was
localized by aligning the map of Wa with that of the 5'
amplified unit of Wb (amplicon III) and comparing the
restriction maps of p165-3 and p654-3. The joints of the
amplicons II and III in Wb were localized in a similar manner
by comparing the maps of p654-3 and p654-4. To localize
the Wc joint, the maps of m168-1 and m177-1 were aligned
(Figure 1). The sequences of the analysed joints are shown
in Figure 4. It is evident from these data that Alu repeats 
Figure 2b.
ᵃThe values are corrected for blank values as described in Materials
and methods. The standard deviations are indicated.
ᵇThe copy number for Cκ is taken as 1.0. The W/Cκ ratios are
calculated by dividing the c.p.m. of the W containing bands by the
one for Cκ; the resulting numbers of W copies are given in
parentheses.

played a role in the recombination processes, which led to 
the formation of the novel joints. 

The W contigs reside on the long arm of
chromosome 2

As we have not been able to link by chromosomal walking
or PFG the W contigs to the κ locus, we tried to determine
the chromosomal location of W by analysing somatic cell
hybrids.

In one set of experiments, mouse-human hybrid cell
lines, which contain only parts of human chromosome 2,
were analysed with probes specific for the κ locus or the
W regions. In one of the lines, JI4-2L (Erikson et al., 1983),
only that part of the short arm of chromosome 2 is present
that comprises Cκ to the telomere (2p12-tel). DNA of this
cell line hybridizes only with a Cκ-specific probe but with
neither probes derived from the Vκ gene-containing parts
of the locus nor the W region-specific probe m654-1
(Zimmer, 1989).

A second analysed cell line, RRP5-3 (Shiloh et al., 1985),
represents the almost complementary situation to JI4-2L, as
RRP5-3 contains a 2p⁻ chromosome, i.e. the long arm and
only that part of the short arm between the centromere and
the κ locus in 2p12. DNA of this cell line does not hybridize
with any of the single-copy probes specific for the κ locus;
however, it gives a strong signal with m654-1 (Zimmer,
1989). According to these data, the W contigs are located
either on the short arm of chromosome 2, between the κ
locus and the centromere, or on the long arm.

A more precise localization of the W regions was achieved
by in situ hybridizations; these results are shown in Figure 5.
The κ locus specific probe pC-2 (Klobeck et al., 1984) maps
to the short arm of chromosome 2 in the region 2p11-2p12
in accordance with the location of the κ locus (Malcolm
et al., 1982; McBride et al., 1982). The W-derived cosmid
clones cos178, cos165 (Figure 1) and cos140 (Pohlenz et al.,
1987) as well as the subclone m654-1 (data not shown) map
to a region close to the centromere on the long arm of
Fig. 3. Long range map of the W contigs. The map was constructed on the basis of numerous PFG blots, most of which were hybridized
consecutively with different probes in order to find out whether certain fragments are recognized by more than one probe [a list of fragments
hybridizing with W-specific probes can be found in Zimmer (1989)]. The shown orientation of Wb is arbitrary, as there is no cloned restriction site
within Wb which would allow one to determine its orientation. Wa and Wc were oriented by identification of restriction sites for NotI and NruI in
cosmids of the Wa contig, and of a NotI site in cosmids of the Wc contig. These sites are also present in DNA of PC-3 which is used in the PFG
experiments (map positions 500, 550 and 1150 kb). The distance between the Wa and Wc contigs is defined by the 600 kb NotI fragment which
hybridized with m171-3 and m177-1. The contig Wb is placed in the centre of this fragment since we assume about equal sizes of the amplicons (see
Discussion). The 500 and 700 kb NruI fragments are defined by hybridization with m654-1, m171-3 and m177-1. The amplicons I-IV are shown as
black bars. Arrows indicate the amplicon orientation based on the transcriptional orientation of Vκ genes (I-III); the orientation of amplicon IV is
based on its restriction map which is homologous to that of amplicon I (Figure 1). Filled and open triangles mark the regions to which the indicated
probes hybridize. Open symbols indicate that the hybridizing region is not represented by cosmid clones. The terminal part of the map without
known restriction sites is shortened (-//-). The analysed cell line PC-3 (Kaighn et al., 1979) is heterozygous for the NruI site marked by a
rhombus. The linking of Wa to Wb and Wb to Wc is based on those restriction fragments marked with larger letters [for details see Zimmer
(1989)].



chromosome 2 (2cen-q11). This position is obviously
distinct from the κ locus.



Discussion 

Size, copy number and organization of the W region 
amplicons 

According to the PFG experiments and the copy number
determinations, four amplified units hybridizing with W
region-specific probes reside within a DNA stretch of 650 kb
and are organized as an array of inverted repeats. The cloned
parts of the amplicons I-IV are 50-100 kb each (Figure 1).
If the length of non-amplified DNA between the amplicons
I and II and between III and IV (Figure 3) is similar to that
between the amplicons II and III, the average size of the
amplicons is in the range 110-160 kb. The chromosomal
organization of the W amplicons, i.e. variable amplicon size,
head-to-head or tail-to-tail orientation of the amplified units
and stretches of non-amplified DNA in between two
amplicons, is very similar to the organization of, for
example, the amplified genes of dihydrofolate reductase
(DHFR), CAD, c-myc and adenylate deaminase (AMPD)
(DHFR, Ma et al., 1988; Heartlein and Latt, 1989; CAD,
Ardeshir et al., 1983; Ford and Fried, 1986; c-myc, Ford
and Fried, 1986; AMPD, Hyrien et al., 1988).

One major difference between the organization of the W
regions and the amplified units reported in the literature is
the low copy number and a rather small amplicon size. The
amplicons described so far exist either in a low copy number,
which is the case after first step selection, but have a size
up to 10 000 kb, found for CAD amplicons (Giulotto et al.,
1986) or the size is in the range of that of the W region
amplicons, but the copy number reaches a value up to 2600
copies per cell, as has been described for AMPD amplicons
(Yeung et al., 1983). Whether these differences reflect a
different mechanism of amplification responsible for the
generation of the W regions is discussed in a following
section.



Novel joints are formed within Alu repeats 
Relatively few studies define the DNA sequences at the sites 
of recombination associated with amplification (for review, 
see Stark et al. , 1989). The sequence analyses of novel joints 
revealed the existence of partial Alu repeats. This finding 
makes it very likely that the repeats played a role either 
during the amplification process itself or in recombinations, 
which are necessary to resolve aberrant replication bubbles 
into a linear array of amplified units. Depending on the 
assumed mechanism of amplification (for review, see Stark 
et al., 1989), one can imagine several ways in which Alu 
repeats participate in the amplification process. In the context 
of strand switch models (Nalbantoglu and Meuth, 1986; 
Hyrien et al. , 1988), Alu repeats could serve as those 
sequences which promote the strand switch by the DNA 
polymerase within the replication fork. According to such 
a model, one would expect to find partial Alu repeats at the 
amplification joints if the strand switch event did not occur 
at identical positions on the leading and lagging strand. 

In the course of recombination, which is always associated
with DNA amplification, Alu repeats might serve as target
sites for the recombinations. The involvement of Alu repeats
in genomic recombinations has been reported frequently (see
literature cited in Hyrien et al., 1987).

Although we cannot decide at which step during the 
generation of the W amplicons the Alu repeats played a role, 
these repetitive sequence elements seemed to be important 
for the amplification of a W region precursor. 



Human immunoglobulin Vκ genes



[Figure 4, panels (a)-(c): aligned nucleotide sequences of the Wa, Wb and Wc joint fragments and the Alu consensus sequence (ConAlu); the sequence text is not recoverable from the scan.]

Fig. 4. Sequences of fragments spanning joints between amplified and non-amplified DNA. The sequences are aligned to show maximal homology.
Nucleotide positions which are identical between two sequences are marked by vertical bars. The regions from which the sequences are derived are
indicated. ConAlu is a consensus sequence of human Alu repeats (Kariya et al., 1987). The break-off in homology between two sequences is marked
by a filled triangle. Those sequences which are believed to be the result of the recombinations associated with the generation of respective amplicons
('recombined sequences') are marked by asterisks. The non-marked sequences have to be considered as 'reference sequences', which allow detection
of the nucleotide position in the recombined sequence at which amplified and non-amplified DNA is joined. The sequences including some extensions
that are not shown here are being transmitted to the EMBL Data Library. The sequencing strategies are described in Zimmer (1989). Both strands
were sequenced in all cases. The sequence Wa in (a) is derived from a subcloned fragment of p165-3, spanning the joint between amplicon I and
non-amplified DNA; the 'reference sequence' Wb is derived from a subcloned fragment of p654-3 (Figure 1). In (b) the sequence 5' Wb is derived
from a fragment that contains the border of amplicon III; it was subcloned from p654-3 (Figure 1; Zimmer, 1989). The reference sequence, 3' Wb,
is derived from a fragment that contains the border of amplicon II; the fragment was subcloned from p654-4 (Figure 1). The sequence Wc in (c) is
derived from a subcloned fragment of m177-1 which contains the break-off in homology to m168-1 (Figure 1). The boxed block of 56 nucleotides [...]
Wa shows 90% sequence identity to the consensus sequence of human Alu repeats; the 5' part of the complementary strand is shown. The Alu
sequence ends at position 177. The sequence 3' of position 177 in Wa shows 80% sequence identity to human LINE repeats (Skowronski et al. [...]



The W regions: products of two successive 
amplifications 

The results of sequence comparisons of genes derived from
amplicons I-III make it very likely that these three
amplicons were generated at different times and, hence, in
a multistep process during evolution (Zimmer et al., 1990).
According to these data, the amplicons II and III, which form
the large inverted repeat within Wb (Figure 1), were formed
later in evolution than amplicon I. This conclusion is also
supported by the results of restriction map comparisons of
Fig. 5. Localization of the W contigs by in situ hybridization. In (a)
the grain distribution on chromosome 2 is shown; on the left side for
the W-derived cosmid clone cos165 and on the right side for the Cκ
probe pC-2. For the probes cos178, cos140 and m654-1 the same
localization was found as shown for cos165. In (b) arranged pairs of
chromosome 2 are shown after hybridization with the indicated probes
(cos178, cos165, cos140 and pC-2). Cosmid probes were detected by a
non-radioactive procedure, pC-2 and m654-1 after radioactive labelling.

amplicons I-III, which show that of 29 sites mapped within
a stretch of 40 kb only three sites differ between amplicons
II and III. It is very likely that the three differences were
created by a single event only, which led to a deletion of
1 kb in amplicon II (Figure 1). More differences are found
between amplicons II and I (10 out of 33 sites differ) and
between III and I (seven out of 33 sites differ). The
evolutionary relationship of amplicon IV to the other three
amplicons is hard to prove with this strategy, as those parts
of amplicons II and III that are homologous to amplicon IV
have not yet been cloned (Figure 1).
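
The restriction site counts quoted above can be turned into fractions to make the ranking of the amplicon pairs explicit. This is only an arithmetic sketch using the numbers from the text; treating the fraction of differing sites as a rough proxy for divergence is an assumption of the sketch, not a method stated by the authors.

```python
# (differing sites, sites compared) for each amplicon pair, as quoted in the text.
SITE_COMPARISONS = {
    ("II", "III"): (3, 29),
    ("II", "I"): (10, 33),
    ("III", "I"): (7, 33),
}

def fraction_differing(differing, compared):
    """Fraction of mapped restriction sites that differ between two amplicons."""
    return differing / compared

for pair, (differing, compared) in SITE_COMPARISONS.items():
    print(pair, f"{fraction_differing(differing, compared):.1%}")

# The pair with the smallest fraction of differing sites.
closest_pair = min(
    SITE_COMPARISONS,
    key=lambda pair: fraction_differing(*SITE_COMPARISONS[pair]),
)
print("most similar pair:", closest_pair)
```

Amplicons II and III come out as the most similar pair, in line with the conclusion that they were formed later than amplicon I.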

A chain of events which could have led to the generation
of the W regions during evolution is shown in Figure 6. One
assumption of the scheme is that the W region precursor had
a structure similar to that part of Wb which contains the genes
W5 to W9. This is reasonable to assume, as the five genes
have the same transcriptional orientation (Figure 1), which
also holds true for most genes of the κ locus, from where
the W precursor must be derived. In a first step, a section
of ~150 kb with the genes W7, W8 and W9 is duplicated
in such a way that a large palindrome is formed. The newly
generated genes 7', 8' and 9' represent the genes W1, W2
and W3 of Wa. In a second step, which led to the duplication
of a large block of DNA of ~300 kb, a large palindrome
is again formed. A third copy of the genes 7, 8 and 9 (7",
8" and 9") is generated; these copies represent W4, W10
and W11 of Wb. Amplicon IV is a copy of the 3' part of
amplicon I, which has been formed in the first step.




Fig. 6. Model of the generation of the W regions in a multistep
process. The precursor contained five genes, marked by filled dots.
Arrows indicate transcriptional orientations of genes. The section
which becomes duplicated in the course of the first amplification event
and the generated amplicons are drawn as hatched arrows with the
arrow-heads pointing to the 3' end with respect to the orientation of
the genes. The brackets indicate that this structure does not exist in the
genome anymore. That part of the intermediate product which is
duplicated in a second event and the resulting amplicons are marked
by dotted arrows. The four generated amplicons are termed I-IV in
accordance with Figures 1 and 3. Genes generated in the first step are
marked by ', those of the second step by ". The gene numbers of the
present day W regions are given in parentheses. Parts of Alu repeats
that have been detected at the joints are indicated. At the amplicon I
joint, the 5' part of an Alu repeat is part of the amplicon (hatched
rectangle), the deleted 3' part is marked by an open rectangle with a
dotted line. At the joint of amplicon III, the 3' part of an Alu repeat
is present (open rectangle); it is part of the non-amplified DNA 5' of
amplicon III. At the amplicon IV joint only a few nucleotides of an
Alu repeat are present (Figure 4c). The deleted 5', as well as 3',
sequences are marked by open rectangles with dotted lines. The lower
panel shows a long-range map of the W regions, similar to that shown
in Figure 3. The open parts of the arrows, which indicate amplicons
I-IV, represent those amplified DNA stretches that have not yet been



A remarkable feature of this series of amplifications is that 
in the first, as well as in the second, step only one additional 
copy is formed. This is in contrast to the generation of other 
amplified units, where the first step of selection yields only 
a few additional copies; second step selections, and, hence, 
secondary amplifications, however, usually result in high 
copy numbers (Saito et al., 1989). Alternatively, the low
W copy numbers can be explained by secondary deletions 
of most of the W copies formed in the amplification events. 

Transposition and amplification of the W regions

For the generation of the W regions we propose the following 
chain of events: (i) the W region precursor, containing the 
genes W5 to W9 (Figure 6), was transposed to the long arm 
of chromosome 2 and (ii) the new chromosomal location of 
the precursor somehow promoted its stepwise amplification. 

A pericentric inversion, involving chromosomal bands 2q1
and 2p1, could have been responsible for the transposition
of the five Vκ genes from the κ locus on the short arm to
the long arm of the chromosome. Such a chromosomal
rearrangement is observed at the evolutionary progenitor of
the chimpanzee (Yunis and Prakash, 1982). Transposition
of Vκ genes by such a process is consistent with a finding
by Graninger et al. (1988), who detected another part of the
κ locus, a copy of the κ-deleting element (κde; Siminovitch
et al., 1985), on the long arm of chromosome 2 in 2q11.
As 2q11 is the same band in which
the W regions are located, one can speculate that the two
regions, the W precursor and κde, were transposed to the
long arm by the same process. Whereas a copy of the κde
is still located 23 kb 3' of Cκ (Klobeck and Zachau, 1986),
we do not have any evidence that a copy of the W regions
exists within the present day κ locus.

As an alternative to the transposition by a pericentric
inversion, the Vκ genes could have been transposed via
episomes. Such a mechanism has been proposed to be
responsible for the amplification of DHFR and mdr1
amplicons in some cell lines (Carroll et al., 1988; Ruiz and
Wahl, 1988). While an episome-mediated mechanism seems
likely for the orphon Vκ genes on chromosomes 1, 22 and
other chromosomes, we prefer for the W regions the idea
of a pericentric inversion.

Data supporting the second assumption, i.e. that the new
chromosomal location promoted gene amplification, have
been reported by Wahl et al. (1984), who found an
influence of the chromosomal position of transfected CAD
genes on the frequency of CAD amplification.

For the amplification event itself, resulting in the formation
of palindromic structures, mechanisms such as those recently
reviewed by Stark et al. (1989) may be responsible.
However, to explain the generation of only one additional
copy per amplification step, one has to postulate that no
replication processes yielding high copy numbers took place
in the course of the W region amplification.

Concluding remarks 

Pericentric inversions of chromosome 2 involving the
chromosomal segments 2p11-2q13 are observed by
cytogenetic methods in about 0.1% of today's population
(Djalali et al., 1986 and earlier literature). Also, de novo
inversions have been observed at chromosome 2 (Vejerslev
and Friedrich, 1984), indicating that the inversion is a
spontaneous event. We consider the possibility that extensive
sequence homologies between parts of the κ locus on 2p12
and the κ locus-derived W and κde regions on 2qcen-q11
promote the inversions. Interstitial telomere-like repeats,
found in 2q11-2q14 (Allshire et al., 1988), might also
contribute to the spontaneously occurring pericentric
inversions observed in the present day human population.

One of the structural features of the W regions, i.e. the
organization of different copies as large palindromes, is also
found for the κ locus itself (Lorenz et al., 1987). It is
tempting to speculate that a duplication involving large parts
of the κ locus followed the same molecular mechanisms as
the amplification of the transposed W precursor. Proving
this hypothesis should be feasible by identifying and
analysing the junctions between duplicated and non-duplicated
sections, and by cloning the head-to-head junction
of the two copies forming the κ locus.

Materials and methods 

Recombinant DNA and restriction maps 

The recombinant cosmids were isolated from libraries described by Pohlenz
et al. (1987) and Lorenz (1989). Colony hybridization was performed as
described previously (Klobeck et al., 1987). Restriction maps and subclones
were constructed using established methods (Maniatis et al., 1982). Cosmid
clones were characterized as described in Pohlenz et al. (1987).

DNA transfer was performed according to Reed and Mann (1985), except
for PFG, where the protocol of Rigaud et al. (1987) was used. Final washing
of filters after hybridization was at 68 °C with 40 mM phosphate, pH 7.2,
1% sodium dodecylsulphate.

For copy number determinations of amplicons, the insert of clone
m654-1/I-1 (Figure 2a) was isolated as a 2.2 kb EcoRI-BamHI fragment
on an agarose gel and labelled according to Feinberg and Vogelstein (1983).
Digested genomic DNA (10 µg) was electrophoresed, transferred to filters
and hybridized. After exposure for 1 day, bands were cut out. For blank
values, filter strips above and beneath each band were used. The filter-bound
activities were measured in a liquid scintillation counter.
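
The copy number arithmetic implied by this procedure can be sketched as follows. The counts are invented for illustration, and two assumptions are made that the text does not spell out: that the blank value for a band is the mean of the strips above and beneath it, and that the reference band stands for a known number of copies (one by default).

```python
from statistics import mean

def net_cpm(band_cpm, blank_cpms):
    """Band activity after subtracting the mean of its blank strips."""
    return band_cpm - mean(blank_cpms)

def copy_number(w_band, w_blanks, ref_band, ref_blanks, ref_copies=1):
    """Estimate copies of a W band relative to a reference band.

    The net c.p.m. of the W band is divided by the net c.p.m. of the
    reference band and scaled by the copy number of the reference.
    """
    return ref_copies * net_cpm(w_band, w_blanks) / net_cpm(ref_band, ref_blanks)

# Hypothetical scintillation counts (c.p.m.), not taken from the paper.
estimate = copy_number(
    w_band=850, w_blanks=[40, 60],
    ref_band=250, ref_blanks=[45, 55],
)
print(round(estimate, 1))
```

Subtracting the blanks before taking the ratio matters most for weak bands, where background can be a large share of the raw count.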

PFG 

Long-range mapping of DNA from the prostate carcinoma cell line PC-3
(Kaighn et al., 1979) using rare cutting enzymes and orthogonal field gel
electrophoresis was performed as described previously (Lorenz et al., 1987).

DNA sequencing 

For sequencing of M13-subclones the 'Sequenase' DNA sequencing kit (US 
Biochemical Corp., Cleveland, OH) was used according to the 
manufacturer's instructions. 

Chromosome banding and in situ hybridization 
Metaphase chromosomes were prepared from PHA-stimulated blood
lymphocytes. The probes m654-1 and pC-2 were radiolabelled with
[³H]dTTP and [³H]dCTP, and hybridization was done as described in
Adolph et al. (1987). A non-radioactive detection method was used for the
localization of cosmid probes. The cosmid clones were linearized and labelled
by the random priming technique with biotinylated dUTP-11. Repetitive
DNA sequences within the genomic insert were saturated by prehybridization
with a 100-fold excess of human Cot-1 DNA for 4 h (Landegent et al.,
1987). The normal in situ hybridization was performed after prehybridization.

For probe detection the slides must remain humid. They were rinsed in
BT buffer (0.1 M sodium bicarbonate, 0.1% Triton-X-100, pH 8.0) and
preincubated for 5 min in BT, 0.1% non-fat dry milk. The slides were treated
with peroxidase-labelled streptavidin (Enzo Biochemicals Inc., New York;
20 µl per 450 µl BT) and probe detection was done with diaminobenzidine
according to the protocol supplied by the manufacturer. The slides were
counter-stained with methylene green.
ABSTRACT Six nonallelic immunoglobulin λ constant
region genes have been previously characterized on a
40-kilobase stretch of DNA. The nucleotide sequences of the three
upstream genes of this cluster (Cλ1, Cλ2, Cλ3) have been
determined by other workers and shown to encode, respectively,
the isotypic Mcg, Kern⁻Oz⁻, and Kern⁻Oz⁺ constant
region of the λ chains. In this paper, we report the sequence of
the three downstream genes of this cluster and show that two
of them (Cλ4 and Cλ5) are pseudogenes. However, Cλ6 encodes
a Kern⁺Oz⁻ chain and corresponds to the fourth isotype
described among the λ proteins sequenced so far. A potentially
active Jλ (joining) segment, with the canonical heptamer and
nonamer sequences for rearrangement, is located 1.5 kilobases
upstream of Cλ6. The amino acid sequence encoded by the Cλ6
gene is compared with the constant region sequences of various



MATERIALS AND METHODS 



differences confirm the polymorphism and complexity of the 



In humans, the constant (C) region of the immunoglobulin λ
light chains consists of at least four nonallelic or isotypic
forms that differ by limited amino acid substitutions to
produce the serological markers Kern (Ke) (1, 2), Oz (3-5),
and Mcg (6, 7). Several additional substitutions have been
described (8-16), but it is unknown whether these represent
allelic variants or distinct isotypes. The human immunoglobulin
λ light chain genes have been mapped to chromosome 22
(17) at band q11 (18, 19), and six nonallelic λ C region genes
(Cλ1 to Cλ6) have been characterized on a 40-kilobase (kb)
stretch of DNA (20). The number of Cλ genes varies between
six and nine per haploid genome (21). These variations were
detected by restriction fragment length polymorphism (21)
and seem to have arisen from unequal meiotic crossing-over
with a duplication of the Cλ2 and Cλ3 genes. Moreover, three
additional Cλ-like genes have been recently identified, which
map on different stretches of DNA and are nonallelic (22).
One of these is a pseudogene, whereas the two others encode
a putative λ chain C region whose sequence differs from that
of the λ chains described so far.

Only three Cλ genes (Cλ1, Cλ2, and Cλ3) belonging to the
cluster described by Hieter have been sequenced (20), and
they have been shown to encode, respectively, the Mcg,
Ke⁻Oz⁻, and Ke⁻Oz⁺ C region of the λ chains. In this paper,
we report the sequences* of the three genes located downstream
in this cluster and show that two of them (Cλ4 and Cλ5)
are pseudogenes, whereas Cλ6 encodes a Ke⁺Oz⁻ chain, the
fourth isotype described among the proteins sequenced so
far. This Cλ6 gene has a potentially active Jλ joining region,
with the canonical heptamer and nonamer sequences for
rearrangement, 1.5 kb upstream of the coding C region.

The publication costs of this article were defrayed in part by page charge
payment. This article must therefore be hereby marked "advertisement"
in accordance with 18 U.S.C. §1734 solely to indicate this fact.



Construction of a Phage Library from LY67 DNA. DNA
prepared from LY67 cells (a λ-producing Burkitt's lymphoma)
(23) was partially digested with Mbo I. Restriction
fragments 15-20 kb long were ligated into BamHI-digested
DNA of phage λ2001 (24) and packaged in vitro. Recombinant
phages were screened by the in situ plaque hybridization
procedure (25).

Probes. A genomic clone (Chr 22λ5) in λgt-λWES (26) was
kindly provided by T. H. Rabbitts (Medical Research Council,
Cambridge, England). This clone contains an 8.0-kb
EcoRI fragment that includes the known nonallelic Ke⁻Oz⁻
(Cλ2) and Ke⁻Oz⁺ (Cλ3) genes and the flanking sequences
(20). We subcloned a 700-base-pair (bp) Bgl II-EcoRI fragment
containing only the Ke⁻Oz⁻ Cλ2 gene (Fig. 1), and this
Cλ probe cross-hybridizes with all the other Cλ-like genes
(20). It was radioactively labeled with [α-³²P]dCTP by
nick-translation (27) and was used to screen the LY67 phage
library.

Subcloning and Sequencing Strategies. One clone, LY67
Cλ3-6 (Fig. 1), was shown to contain Cλ3 to Cλ6. Appropriate
subclones were made in pUC vectors (29). Nucleotide sequence
analysis was carried out by dideoxy chain-termination
procedures (30) in M13 vectors (31) by deploying
exonuclease III-nuclease S1 methods (32) or directed
sequencing using known restriction enzyme sites.

Oligonucleotide Synthesis and Hybridization. A 19-mer
oligonucleotide 5' GTGTTCGGCGGAGGGACCA 3' corresponding
to part of the Jλ3 gene segment sequence (this paper
and ref. 28) was synthesized, radiolabeled, and hybridized to
the LY67 Cλ3-6 clone to search for other Jλ segments.
Low-stringency washes were carried out at room temperature.

RESULTS 

Rearrangement of a VλIII Subgroup Gene to Jλ3 in the LY67
Cell Line. One clone (LY67 Cλ3-6) containing an 18-kb piece
of genomic DNA was isolated and characterized. A restriction
map of this clone is shown in Fig. 1. Comparison of this
map with one previously published (20) suggested that this
clone contains four Cλ genes, namely Cλ3 to Cλ6. The
sequence of the 5' end of the LY67 Cλ3-6 clone shows that a
Vλ gene rearrangement has occurred, joining this gene to the
Jλ3 gene segment, which is located 1.5 kb upstream of Cλ3
(28). Fig. 2A shows the partial nucleotide sequence of the
rearranged Vλ in LY67 and that of a Vλ gene assigned to the
VλIII subgroup and isolated from the Burkitt lymphoma cell
line PA682 (28).

Abbreviations: C, constant; J, joining; V, variable; Ke, Kern.
*The sequences reported in this paper are being deposited in the
EMBL/GenBank data base (Bolt, Beranek, and Newman Laboratories,
Cambridge, MA, and Eur. Mol. Biol. Lab., Heidelberg)
[accession nos. J03009 (Cλ4), J03010 (Cλ5), and J03011 (Cλ6)].
Fig. 1. (A) Restriction map of LY67 Cλ3-6 clone. (B) Sequencing
strategy. B, BamHI; Bg, Bgl II; H, HindIII; R, EcoRI; S, Sst I. Of
the Pst I sites (P), only the one used for subcloning the fragment
containing the Cλ6 gene is indicated. The rearrangement V-Jλ3 is
indicated by an arrow (V, variable region). An asterisk shows the
location of a polymorphic BamHI site present in PA682 DNA (28) but
absent from our LY67 clone.



The deduced amino acid sequence of the rearranged V gene
of LY67 is also compared to the V region of the protein DEL
of the subgroup VλIII (33, 34). A 75% sequence identity
indicates that the Vλ gene rearranged in LY67 is a member of
the VλIII subgroup gene family, and this is in agreement with
the detection of a transcript hybridizing to a VλIII probe in the
LY67 cell line (28). The Jλ3 segment of the LY67 Cλ3-6 clone,
compared to the segment rearranged in PA682, shows two
nucleotide differences (one of them resulting in a valine/leucine
amino acid substitution) that may be due to allelic
polymorphism. Two other nucleotide differences are observed
at the V-J junction and are probably explained by a
flexibility in the mechanism by which junctions occur (35,
36).

Cλ6 Encodes a Ke⁺Oz⁻ Chain. Fig. 2B shows the nucleotide
sequence of the Cλ6 gene and the encoded amino acid
sequence (106 residues). The residues Ala, Ser, and Thr,
found, respectively, at codons 6, 8, and 57 (positions 112,
114, and 163 according to ref. 34) indicate that Cλ6 encodes
a Mcg⁻ protein. Arg (codon 83, position 190) corresponds to
the Oz⁻ marker, whereas Gly (codon 46, position 152)
characterizes the Ke⁺ marker. Therefore the Cλ6 gene encodes
the fourth isotype Ke⁺Oz⁻.

Jλ6 Segment Is 1.5 kb Upstream of Cλ6. Only the Jλ1 (22)
and Jλ3 segments (ref. 28 and this paper) have been characterized;
they have been localized in genomic DNAs at 1.5 kb
upstream of the respective Cλ coding regions. We therefore
used an oligonucleotide corresponding to the Jλ3 sequence
(see Materials and Methods) to search for homologous Jλ
segments in the LY67 Cλ3-6 clone. As expected, a strong
signal was obtained for the Jλ3-containing fragments, whereas
a weaker signal allowed us to detect the Jλ6 segment in a
Sac I-Bgl II fragment upstream of Cλ6. The sequence of this
Jλ6 segment (Fig. 2C) showed that it encodes 12 amino acids
(among them the characteristic Phe-Gly-Xaa-Gly residues)
and that it also possesses the canonical heptamer and
nonamer sequences essential to V-J rearrangement (37, 38).
No signal corresponding to the putative Jλ4 and Jλ5 segments
could be detected in the LY67 Cλ3-6 clone by using either the
oligonucleotide (Jλ3 probe) or the genomic Sac I-Bgl II
fragment (Jλ6 probe), indicating that if these segments exist
their homology is too weak to be detected in our conditions
of hybridization. Since the Jλ3 and Jλ6 gene segments are 1.5
kb upstream of their respective coding regions, we subcloned
fragments located, respectively, at about the same distance
upstream of Cλ4 and Cλ5. Although in both cases, we
detected some conserved heptamer sequences, we did not

[Figure 2, panels (A)-(C): nucleotide sequences with encoded amino acids; the sequence text is not recoverable from the scan.]

Fig. 2. Nucleotide and amino acid sequences. (A) Partial sequence of the LY67 Vλ-Jλ3 rearranged gene. (B) Sequence of the Cλ6 gene. (C)
Sequence alignment of Jλ6 and Jλ1 (22).



9076 Genetics: Dariavach et al.



Proc. Natl. Acad. Sci. USA 84 (1987) 






find the characteristic Phe-Gly-Xaa-Gly residues or the
expected splice site at a downstream position. It is possible
that the heptamer sequences are attached to poorly conserved
pseudo Jλ segments and we cannot exclude the
possibility that the putative Jλ4 and Jλ5 are localized in
fragments that were not sequenced, upstream of the Cλ4 and
Cλ5 genes.

Cλ4 and Cλ5 Are Pseudogenes. Nucleotide sequences and
the encoded amino acid sequences of Cλ4 and Cλ5 are shown
in Figs. 3 and 4. Both genes are pseudogenes; the third codon
of Cλ4 is a stop codon, and Cλ4 displays three deletions. The
first deletion of 9 bp spans codons 5 to 7 and the other two
deletions excise codons 21 and 64. Cλ5 has an 11-bp deletion
(codons 41-44) resulting in a frameshift.
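
The two kinds of inactivating lesion mentioned here, a premature stop codon and a deletion whose length is not a multiple of three, can be checked mechanically. The sketch below is a minimal illustration; the short coding sequence in the example is invented and is not taken from the Cλ4 or Cλ5 sequences.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_stop_codon(cds):
    """Return the 1-based index of the first in-frame stop codon, or None."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        if cds[i:i + 3].upper() in STOP_CODONS:
            return i // 3 + 1
    return None

def deletion_effect(length_bp):
    """Classify a deletion by whether it preserves the reading frame."""
    return "in-frame" if length_bp % 3 == 0 else "frameshift"

# Invented example: the third codon is TAG.
toy_cds = "GGTCAGTAGAAGGCT"
print(first_stop_codon(toy_cds))

# Deletion lengths quoted in the text.
print(deletion_effect(9))
print(deletion_effect(11))
```

A 9-bp deletion removes exactly three codons and keeps the frame, whereas an 11-bp deletion shifts it, which matches the description of the two deletions above.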
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Fio. 5. Physical map of the human X light chain C region. C k l, C k 2, and CJ correspond, respectively, to the nonallelic Meg, Ke~Oz~, and 
Ke-Oz* chains (ref. 20; see Table 1). C k 4 and C k S are pseudogenes (0, whereas C k 6 encodes a Ke+Oz" chain. J J (22), J J (ref. 28 and this 
paper), and J k 6 (this paper) have been localized 1.5 kb upstream of CJ, C k 3, and C k 6, respectively. J k 2 has not yet been localized in genomic 
DNA. No J x gene segment has so far been identified upstream of C K 4 and CJ. 
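
The deletions described above for the Cλ4 and Cλ5 pseudogenes differ in their effect on the reading frame: only the 11-bp deletion is reported to cause a frameshift. The following Python sketch illustrates the underlying arithmetic. The function name is hypothetical and not from the paper, and each single-codon excision is assumed to remove exactly 3 bp.

```python
def causes_frameshift(deleted_bp: int) -> bool:
    """Return True if removing this many base pairs shifts the reading frame."""
    # Codons are read as triplets, so only a deletion whose length is not a
    # multiple of 3 changes the frame of everything downstream.
    return deleted_bp % 3 != 0


# Deletion lengths quoted in the text (a single excised codon is taken as 3 bp).
deletions = {
    "C-lambda-4, codons 5-7": 9,
    "C-lambda-4, codon 21": 3,
    "C-lambda-4, codon 64": 3,
    "C-lambda-5, codons 41-44": 11,
}

frameshift_report = {name: causes_frameshift(bp) for name, bp in deletions.items()}

for name, shifted in frameshift_report.items():
    print(f"{name}: {'frameshift' if shifted else 'in-frame'}")
```

Under these assumptions only the Cλ5 deletion is flagged, which agrees with the text. The Cλ4 deletions leave the frame intact, so the stop codon at the third codon is what inactivates that gene.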



DISCUSSION 

In human λ chain C regions, the Mcg marker involves amino
acid residues at positions 112, 114, and 163 (numbering
according to ref. 34; Table 1) corresponding, respectively, to
codons 6, 8, and 57 of the Cλ genes. Mcg+ proteins have
residues Asn-112, Thr-114, and Lys-163, whereas Mcg−
proteins have residues Ala, Ser, and Thr, respectively, at
these locations. However, the recently sequenced Mor protein
is different from the other Mcg− proteins by having
Ala-163 instead of Thr-163 (16). The Ke and Oz markers
occur at positions 152 and 190, respectively: Ke+ proteins
have Gly-152 and Ke− have Ser-152; Oz+ proteins have
Lys-190 and Oz−, Arg-190. These markers define four nonallelic
forms of human λ chain C regions (Mcg, Ke−Oz−,
Ke−Oz+, and Ke+Oz−), which are encoded, respectively, by
the Cλ1, Cλ2, Cλ3 (20), and Cλ6 (this paper) genes (Fig. 5).
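
The marker rules in the preceding paragraph amount to a small lookup from residues to isotype to gene. The Python sketch below is one way to write that lookup, using only the residues and gene assignments stated in the text. The function and variable names are hypothetical. Testing the Mcg residues first, before Ke and Oz, is an assumption of this sketch and not a procedure described by the authors.

```python
from typing import Dict, Optional

# Residues defining each marker, as given in the text (numbering of ref. 34).
MCG_POSITIVE = {112: "Asn", 114: "Thr", 163: "Lys"}
KE_AT_152 = {"Gly": "+", "Ser": "-"}
OZ_AT_190 = {"Lys": "+", "Arg": "-"}

# Isotype -> gene assignment stated in the text.
GENE_FOR_FORM = {
    "Mcg": "C-lambda-1",
    "Ke-Oz-": "C-lambda-2",
    "Ke-Oz+": "C-lambda-3",
    "Ke+Oz-": "C-lambda-6",
}


def assign_c_lambda_gene(residues: Dict[int, str]) -> Optional[str]:
    """Map marker residues (position -> three-letter code) to a C-lambda gene.

    Returns None when the residues do not match one of the four forms.
    """
    # Assumption: a chain with all three Mcg+ residues is assigned to
    # C-lambda-1 without looking at Ke or Oz.
    if all(residues.get(pos) == aa for pos, aa in MCG_POSITIVE.items()):
        return GENE_FOR_FORM["Mcg"]
    ke = KE_AT_152.get(residues.get(152, ""))
    oz = OZ_AT_190.get(residues.get(190, ""))
    if ke is None or oz is None:
        return None
    return GENE_FOR_FORM.get(f"Ke{ke}Oz{oz}")


# Illustrative inputs only; these are not sequences of named proteins.
mcg_like = {112: "Asn", 114: "Thr", 163: "Lys", 152: "Ser", 190: "Arg"}
ke_plus_oz_minus = {112: "Ala", 114: "Ser", 163: "Thr", 152: "Gly", 190: "Arg"}

print(assign_c_lambda_gene(mcg_like))
print(assign_c_lambda_gene(ke_plus_oz_minus))
```

A Ke+Oz+ combination is not among the four forms listed in the text, so the sketch returns None for it.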

In Fig. 6, monoclonal Bence Jones proteins have been
assigned as products of the Cλ genes 1, 2, 3, or 6 on the basis
of the presence or absence of residues characteristic for the
Mcg, Ke, and Oz markers. In most cases, there is complete
concordance between the protein and the deduced amino acid
sequence of the corresponding Cλ gene. However, other
amino acid changes have been found in several proteins
(Table 2 and Fig. 6). Since these substitutions have been
noted only once, they could represent allotypic differences.
However, it cannot be excluded that some of the Ke−Oz−
proteins could be encoded by a Cλ gene resulting from the
duplication of the Cλ2-Cλ3 region, as has been described in
some individuals (21). In such cases these sequences should
represent new isotypic differences due to the presence of
several nonallelic copies of the Cλ2 gene. Differences observed
in the Jλ2 sequences (Fig. 6) might for the same reason be
either allotypic or isotypic. Differences in the Jλ segment
region might represent allotypic differences, although the
presence of other, not yet identified Jλ segments cannot be
ruled out.

Table 2. Sequence

MOR
ATK
NIG68

Residue numbers according to Kabat et al. (34).

Fig. 6. Sequences of C regions of human λ chains. The protein sequences encoded by the four "active" Cλ genes and associated Jλ gene
segments are compared with the λ protein C regions. The numbering of the Cλ region amino acids is according to ref. 34; e.g., positions 169,
201, and 202 are excluded in the Cλ sequences for purposes of alignment with human Cκ chains. For an easier alignment, the Jλ and Cλ gene
segments are considered as being spliced and a vertical line is drawn to indicate the V-J junction. The horizontal line in the SM sequence shows
a deletion.

If we compare the deduced amino acid sequence of the Cλ6
gene with two known Ke+Oz− proteins that have identical Cλ
coding regions, SM (53) and Kern (52), the protein predicted
for Cλ6 shows three differences: (i) lysine at position 145
(codon 39) instead of threonine, (ii) asparagine at position 156
(codon 50) instead of lysine, and (iii) alanine at position 212
(codon 103) instead of threonine (Fig. 6). These differences
may represent allotypic variations, although we cannot
entirely exclude the possibility that these different Ke+Oz−
sequences are encoded by nonallelic genes. More sequences
of Cλ genes or λ proteins should help estimate the extent of
the human λ chain polymorphism.
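
The position/codon pairs quoted in this Discussion (112/6, 114/8, 163/57, 145/39, 156/50, and 212/103) are consistent with a fixed offset of 106 plus one further step for each excluded position (169, 201, 202) that lies before the residue. The Python sketch below encodes that reading. The offset is inferred from the quoted pairs and is not stated as a rule by the authors; the function name is hypothetical.

```python
# Positions excluded from the C-lambda sequences according to the Fig. 6 legend.
EXCLUDED_POSITIONS = (169, 201, 202)

# Offset implied by the pairs quoted in the text, e.g. position 112 <-> codon 6.
BASE_OFFSET = 106


def position_to_codon(position: int) -> int:
    """Convert a residue position (numbering of ref. 34) to a C-lambda codon number."""
    if position in EXCLUDED_POSITIONS:
        raise ValueError(f"position {position} has no codon in the C-lambda genes")
    # Every excluded position before this residue shifts the codon number by one.
    skipped = sum(1 for p in EXCLUDED_POSITIONS if p < position)
    return position - BASE_OFFSET - skipped


# Pairs quoted in the text: (position, codon).
quoted_pairs = [(112, 6), (114, 8), (163, 57), (145, 39), (156, 50), (212, 103)]

for position, codon in quoted_pairs:
    print(position, position_to_codon(position), codon)
```

All six quoted pairs are reproduced by this rule, including position 212, which lies beyond all three excluded positions.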

Note Added in Proof. The Jλ2 gene segment has recently been
localized upstream of Cλ2 (54).

ABSTRACT A comparison of five constant region sequences
of human and mouse κ and λ immunoglobulin
chains has been undertaken in order to reveal sequence
homologies and evolutionary relationships. Simultaneously, a
comparison with the three-dimensional structure of one
mouse κ-chain (McPC 603) has suggested structural reasons
why many of the residues are invariant or conserved along κ
versus λ lines. There are a number of residues that have remained
invariant despite exposed positions for reasons that
do not appear to be connected with the folding of this CL domain.

The constant region (CL domain) of immunoglobulin light
(L) chains contributes substantially to the functioning of an
immunoglobulin molecule. While it is not involved directly
in the specificity and complementarity of the antibody combining
sites, it is joined to its counterpart in the heavy (H)
chain, the CH1 domain, by a variety of noncovalent interactions
as well as, in most immunoglobulins, by a disulfide bond
at its C-terminal or subterminal Cys. This -S-S- bond is not
essential to L-H association, which is maintained noncovalently
even after reduction and alkylation. In one immunoglobulin
subclass of IgA2, the L-H bond does not occur, but
an L-L dimer is formed, which remains noncovalently
linked to the H chains. Bence Jones proteins in the form of
L-L dimers also occur and may be held together by noncovalent
forces (1).

There are two subclasses of light chains, κ and λ, which
are present in almost all species examined; both are found in
all five classes of immunoglobulins, IgG, IgM, IgA, IgD, and
IgE, but each immunoglobulin molecule contains two identical
κ or two identical λ chains.

The availability of the sequences of the CL domain of
human κ, human λ, mouse κ, and two mouse λ light chains
(2), together with x-ray data on the three-dimensional structure
of this domain (3-5), made it desirable to evaluate, if
possible, structural influences on the evolution of these domains.

The present study attempts, residue by residue among
these five chains, to relate preservation or variation of sequence
to structure and function as evaluated from a three-dimensional
model of the CL and its interactions with the
CH1 domain. The findings show some interesting stretches
of sequence in which invariance predominates, and others in
which evolutionary divergence has been essentially along κ
and λ lines. Positions at the surface of protein molecules that
are accessible to solvent may undergo many mutational
changes which do not affect three-dimensional folding,
while residues that are buried tend to be invariant or highly
conserved (6). In CL domains, mutations involving residues
that contact the CH1 domain also tend to be restricted.

The present study shows that residues preserved along κ
and λ lines generally show conservative substitutions if in
the interior of the domain or if buried. Fewer exposed residues
which have diverged along κ and λ lines are homologous.
Certain residues remain invariant despite an essentially
exposed position and for no obvious reason. Of special interest
is the observation that at only two and four positions,
respectively, were human κ identical with human λ and
mouse κ identical with mouse λ, while human κ and mouse κ
were identical at 29 and human λ and mouse λ at 39 positions.



MATERIALS AND METHODS 
The model of the Fab fragment of mouse McPC 603 constructed
from x-ray data at 3.1 Å resolution was used (5) as
well as published information on the CL regions of a Bence
Jones dimer (4) and human Fab fragment (3). Sequences of
human κ, human λ, mouse κ, and two mouse λ chains were
available (2). These sequences were aligned for maximum
structural and sequence homology from residues 101 to 215
and modified to include the additional data reported (7).
Each residue was located in the model of McPC 603 and
classified according to whether it was: exposed, 0; mainly exposed,
1; partly exposed, partly buried, 2; mainly buried, 3;
completely buried, 4; or in contact with CH1, C. In addition,
each position was classified as: invariant; four of five chains
and three of five chains identical; human κ and human λ
identical; mouse κ and mouse λ identical; human and mouse
κ identical; human and mouse λ identical; and human κ,
human λ, mouse κ and mouse λ different.
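
The location codes defined above (0 to 4, and C) lend themselves to a simple tally of the kind reported in the Results. The Python sketch below shows one way to count positions per location class from a string of codes. The function name and the example code string are hypothetical and are not data from the paper.

```python
from collections import Counter

# Location codes exactly as defined in Materials and Methods.
LOCATION_CODES = {
    "0": "exposed",
    "1": "mainly exposed",
    "2": "partly exposed, partly buried",
    "3": "mainly buried",
    "4": "completely buried",
    "C": "in contact with CH1",
}


def tally_locations(codes: str) -> Counter:
    """Count how many positions fall into each location class."""
    unknown = set(codes) - set(LOCATION_CODES)
    if unknown:
        raise ValueError(f"unrecognized location codes: {sorted(unknown)}")
    return Counter(LOCATION_CODES[code] for code in codes)


# Hypothetical codes for six consecutive positions, for illustration only.
example_tally = tally_locations("C1C040")

for label, count in example_tally.items():
    print(f"{label}: {count}")
```

Grouping the counts for codes 3, 4, and C would give the "buried or contacting" totals of the kind quoted later in the Results.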



RESULTS 

Table 1 lists the sequences of the five chains from positions
101 to 215. Above each residue is its classification from its
position in the model.

Fig. 1 summarizes the sequence data in Table 1 with respect
to the identities specified above. It is evident that clusters
of invariant residues and those with 4/5 chains identical
occur, notably at positions 118 to 123, 148 to 152 (excluding
150), 176 to 182 (180 3/5 identical), and 194 to 200 (excluding
195). There are also several clusters in which κ versus λ
diversification predominates, 128 to 132 (130 invariant) and
especially 164 to 172.

Table 1. Sequences of the switch and constant regions of human and mouse immunoglobulin light chains and their location
from the model of mouse κ myeloma protein McPC 603 constructed from x-ray data

Sequence data are from Gally (2) with the mouse κ chain data revised according to Svasti and Milstein (7). Residues at 169 to 177 have been
realigned to remove the gap in all λ chains at 178 and replace it by a gap at 169. Residues 201 and 202 have been moved to 203 and 204, leaving
a gap at 201 and 202.

Of the 114 residues considered, 28 were invariant, and 17 and
8, respectively, showed 4/5 and 3/5 chains identical. Of the
kinds of identities of two chains in the remaining positions,
human κ and human λ were identical at only two positions
and mouse κ and mouse λ at only four positions, while
human and mouse κ were the same at 29 positions and
human λ and one or both mouse λ chains at 39 positions. At
two positions a human κ and a mouse λ chain were the same
(*) and at three positions human λ and mouse κ chains had
the same residue (•). At only eight positions were all four
chains different.



Table 2 summarizes the sequence data in Fig. 1 in relation
to the location of each residue in the model. The most striking
finding is that of the eight positions at which the four
chains had different amino acids (Fig. 1), six residues were
completely exposed to solvent and the remaining two were
mainly exposed. Moreover, of the two positions at which
human κ and human λ were the same (one of which was the
Oz marker at position 190) and of the four positions at
which mouse κ and mouse λ were identical, all but one were
completely or mainly exposed. The sixth, residue 135, was a
contacting residue adjacent to invariant Cys 134. Three of
the five positions in which human and mouse identities occurred
but which were not both κ or both λ, as indicated by
the symbols (* and •) in Fig. 1, were completely exposed;
these included the Inv marker 191. Position 152, which in
human λ chains carries the Kern marker, was also exposed
(3).

Fig. 1. Distribution of identical residues in the switch and constant regions of human and mouse immunoglobulin light chains. The arrows
indicate identical residues in two or more of the five chains as well as at positions at which the four classes of chains differed. When
more than one arrow occurs at a given position, there was identity among two sets of chains. Thus, at position 169 human and mouse κ chains
had Lys while human λ and both mouse λ chains had a gap. Arrows with an * and a • indicate the few residues identical in human κ and
mouse λ and human λ and mouse κ, respectively. A dashed arrow indicates an Asx or Glx at that position in one or more chains.

Table 2. Location from x-ray structure and degree of evolutionary preservation of residues in the CL domain and switch
region of human and mouse light chains

Of the 28 invariant residues, 18 were contact residues or
were mainly or completely buried, while nine were mainly
or completely exposed. Considering together the 25 positions
in which 4/5 and 3/5 residues were identical, 12 were completely
or mainly exposed, only one was a contact residue,
while five were completely buried.

Of the 29 positions at which human κ and mouse κ had the
same amino acid and the 36 (footnote Table 2) at which
human λ and one of the mouse λ chains had the same amino
acid, seven in each were contacting residues and three and
six, respectively, were completely buried while 15 and 20,
respectively, were completely or mainly exposed.

Of the seven invariant exposed residues, six were charged,
2 Glu, 1 Asp, and 3 Lys; the seventh was Cys 214, which
forms the -S-S- bond to the H-chain or in some cases to another
L-chain. The location of these six residues and the two
mainly exposed residues His and Thr in the model provides
no insight into the basis for their invariance.

The remaining 19 invariant residues, those partly or completely
buried and the contacting residues, included 12
strongly hydrophobic residues, 1 Trp, 2 Phe, 4 Pro, 3 Leu,
and 2 Val; the remaining residues were Cys 134 and Cys
194, which form the domain S-S bond, two less strongly hydrophobic
residues, Ala and Thr, and three Ser.

Examining the residues with 4/5 and 3/5 identities, the
partly, completely buried, and contacting residues were also
overwhelmingly hydrophobic, consisting of 2 Tyr, 5 Val, 1
Thr, and 1 Ala, the others being 1 Gly and 1 His and the gap
at position 201. The nonidentical residues at these positions
were also hydrophobic in almost all cases, involving replacements
of 2 Ile, 2 Leu, and Ala for the five Val residues and
of Phe for one Tyr; the remaining Tyr 192 was replaced by
Ser in both mouse λ chains, the Thr by Ser or His, the Ala by
Asx, and the Gly by Thr. There is obviously an extraordinary
preservation of structure in terms of these groups of residues.
Among the 15 and 16 residues that have diverged along κ
versus λ lines and which are partly, mainly, and completely
buried or are contacting residues, eight are identical pairs;
three of these involve conservative hydrophobic substitutions,
Ile-Leu, Val-Leu, and Leu-Ile, at positions 117, 132,
and 136, respectively, involving mainly and completely buried
residues; the remaining pairs are the substitutions Thr-Lys
at 172, Gln-Glu 124, Ser-Thr 131, Glu-Ser 165, and
Thr-Tyr 178, the last four being contacting residues. The
presence in κ chains of an additional residue at 169 causes
Thr 172 to be completely buried in the mouse Cκ domain; in
the λ chains, Lys 172 is probably completely exposed to solvent.
The unpaired residues are 1 Phe, 1 Tyr, 2 Ser, 1 Asp,
and 1 Thr in the κ, and 1 Ala, 1 Val, 1 Gln, 1 Glu, 1 Lys, and
3 Thr in the λ group.

The κ versus λ differences show a predominance of completely
and mainly exposed residues, with 14 of 29 and 20 of
36 residues in κ and λ, respectively, falling into this group.

The region 101-108 consists of two invariant residues, 101
and 102, followed by six positions in which residues with 4/5
identities alternate with residues which have evolved along κ
and λ lines, including the gap at position 108; four of the positions
are completely exposed, one is partly exposed, and
two are completely buried. Arg 107 marks the end of the
mouse Vκ domain; Cκ starts with Ala 109. The additional residue
at position 108 in λ chains could be accommodated by
a hairpin bend facilitated by Gly 107 or by Pro 109.

There is a cluster of invariant residues from 118 to 123
consisting of three contacting residues, 118, 119, and 121,
one partly buried, 120, and two exposed residues, 122 Ser
(4/5) with an Asp alternative, and an invariant Glu 123. The
region 124-140 contains some invariants but is largely made
up of residues which have evolved along κ and λ lines; residues
130-137 consist exclusively of contacting and completely
buried residues, of which three are invariant, in another
three κ and λ differ, and in the others more variation
has occurred. At 135 human κ and λ have Leu while mouse κ
has Phe and mouse λ Thr, and at position 137 human and
mouse κ have Asn, human λ Ser, and mouse λ Thr.

The most striking region which has been preserved along
κ versus λ lines, 160 to 175, has five contacting, one completely
and four partly buried residues, and two mainly and
three completely exposed residues. It is followed by a largely
invariant cluster, 176 to 181, of contacting and buried residues.
The remainder of the molecule is mainly exposed except
for the region around Cys 194, in which buried and
mainly exposed residues alternate and there is no clustering
of invariant or κ versus λ residues.



DISCUSSION 

The general structural relationships described for the CL region
of human and mouse κ and λ chains are an initial attempt
to understand the basis for the evolutionary preservation
of certain regions as essentially invariant and the preservation
of others along κ versus λ lines, an evolutionary divergence
which took place about 200 million years ago (8). The
principles established from sequence and structural findings
on other proteins generally apply equally well to the immunoglobulins.
Thus the buried residues and those contacting
the heavy chain tend to be largely invariant and hydrophobic,
while residues that are exposed or mainly exposed may
vary, and are generally polar. These also include the few
residues that have evolved along human versus mouse lines.
There are, however, substantial numbers of hydrophobic
residues that are invariant or identical in 4/5 or 3/5 chains
and that occur in regions of the molecule for which no obvious
structural basis for their conservation may be assigned.

The marked clustering of residues which have been preserved
along κ versus λ lines in certain portions of the chain,
most notably at positions 160 to 175, or have been maintained
invariant, such as 118 to 123, suggests that these may
have unique functions.

The CL domain has been extraordinarily preserved once κ
versus λ diversification occurred, since at only eight positions
were all four chains different and at only five other positions
were human and mouse chains identical despite a κ
versus λ difference (Fig. 1).

The immunoglobulin findings were compared with values
for human and mouse hemoglobins, using the α chain as
equivalent to κ and the β (γ) chain equivalent to λ (8). Since
hemoglobin chains are longer, the data were normalized to
115 residues. The findings for hemoglobin are strikingly different
from the immunoglobulin results. Thirty-six residues
were invariant, and 4/5 and 3/5 chains were identical at 12
and at 49 positions, respectively. Human and mouse α and
human and mouse β were identical at 13 and 11 positions,
respectively. At only one position each were mouse α and
mouse β, human α and mouse β, and human β and mouse α
identical. There were no positions at which all four chains
differ. Thus there were 97 residues in which three or more
chains were identical in hemoglobins in contrast to 53 such
residues in the immunoglobulins. This comparison indicates
a higher degree of evolutionary conservation of residues that
did not differentiate along α versus β lines than residues that
differentiated along κ versus λ lines. The immunoglobulin
CL domain thus shows a higher degree of evolutionary
adaptability than do the hemoglobins.
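
The totals of 53 and 97 quoted above follow directly from the counts given in the Results and in this paragraph. The short Python sketch below reproduces that arithmetic. The variable names are hypothetical. Expressing each total as a fraction of the positions compared (114 for the light chains, 115 after normalization for hemoglobin) is an added illustration, not a calculation reported by the authors.

```python
# Counts quoted in the text: positions that are invariant, or identical
# in four of five or three of five chains.
immunoglobulin = {"invariant": 28, "4/5 identical": 17, "3/5 identical": 8}
hemoglobin = {"invariant": 36, "4/5 identical": 12, "3/5 identical": 49}

ig_total = sum(immunoglobulin.values())
hb_total = sum(hemoglobin.values())

# Positions compared: 114 for the light chains; the hemoglobin data were
# normalized to 115 residues.
ig_fraction = ig_total / 114
hb_fraction = hb_total / 115

print(f"immunoglobulin: {ig_total} positions ({ig_fraction:.2f} of those compared)")
print(f"hemoglobin:     {hb_total} positions ({hb_fraction:.2f} of those compared)")
```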
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Abstract 

Completion of the X-ray analysis of the human B7-15A2 Fab opened a new vista (Immunotechnology 3, no. 4).
In the crystal lattice, both the λ-type light chain (CL domain) and γ1-type heavy chain (CH1 domain) participated in
formation of antiparallel β-pleated sheets with neighboring molecules related to the reference Fab by 2-fold axes. This
observation evoked memories of the first description of this type of packing for human Bence-Jones (λ chain) dimers
20 years ago (Ely K.R. et al. Biochemistry 1978;17:158-167). Reexamination of packing interactions in selected
crystal systems revealed that the C domains of λ and γ1 chains were structurally amenable to the formation of such
cross-molecule β-structures, but κ chain CL domains were not. In the latter, a single proline residue disrupted the
order of β-strand 3-3 in the middle of the surface used in λ and γ1 chains for intermolecular interactions with
symmetry-related molecules. For the packing of Fv molecules, the VL domains are structurally well suited for
analogous packing interactions through antiparallel 4-1 β-strands in adjacent molecules. Such interactions have been
shown to provide the driving force in the crystal packing of a human (Pot) Fv from an IgM-κ cryoglobulin. Together,
these observations suggest several avenues through which propensity to crystallize can be programmed into the
designs of synthetic human Fabs, Fvs and single-chain antibodies. © 1998 Elsevier Science B.V.

Keywords: Human B7-15A2; β-pleated sheets; Bence-Jones dimers; Fv molecules
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1. Introduction 

One of the most fertile sources of information
about macromolecular interactions can be found
amongst protein constituents orderly arranged in
crystal lattices. This resource is regularly being
upgraded through an ever increasing number of
structures solved by X-ray analysis. Symmetry
operations in most of the commonly appearing
space groups afford exciting opportunities to
observe groups of these molecules in a variety of
combinations and orientations within the lattice.
Taking advantage of such opportunities is
made possible by the powerful computer-assisted
graphics workstations and sophisticated
software that are currently available.

A particularly elegant stacking pattern was observed
in crystals of an Fab from a human
IgG1-λ antibody (B7-15A2) against tetanus toxoid.
The key design element in these crystals is
a continuous ribbon of Fabs stabilized through
interlocking, antiparallel six-stranded β-pleated
sheets between the CL and CH1 domains of
symmetry-related molecules. Each Fab is bound
to two other molecules, one on each side, and
this process is repeated to produce a ribbon of
indefinite length. When correlated with information
previously accumulated for human Bence-Jones
(L chain) dimers and other Fabs, the
crystal packing of B7-15A2 suggests that the
propensity to crystallize can be programmed
into the design of some synthetic Fabs.



2. Bilateral formation of β-pleated sheets across
2-fold axes

A stereo diagram illustrating the self assembly
of B7-15A2 Fabs into a trimeric packing
unit in the crystal lattice is presented in the top
panel of Fig. 1. Main chain atoms of the outer
strand (3-3) of the three-chain antiparallel β-pleated
sheet in the CL domain of one Fab
(e.g. central molecule in Fig. 1) form hydrogen
bonds across a crystallographic 2-fold axis with
constituents of β-strand 3-3 in the CH1
domain of a symmetry-related Fab (on the right).
On the opposite (left) side of the reference
molecule, the interactions are repeated in reverse
as CH1 interacts with the CL domain in a
third Fab.

Bilateral, six-stranded β-sheet structures
across crystallographic dyads were first described
for a human (Mcg) Bence-Jones dimer
[1]. They were subsequently observed in the X-ray
analysis of another (Loc) λ-type Bence-Jones
protein [2]. This type of packing is
shown in the lower panel of Fig. 1 for three
adjacent subunits in the ribbon of Mcg dimers.
Models are displayed in the same format and
orientation to facilitate comparisons of the B7-15A2
and Mcg ribbons.



3. Formation of β-pleated sheets by CL or CH1,
but not both

A six-stranded β-pleated sheet is also found
in crystals of a heterologous dimer of two human
λ-chains (Mcg and Hud), but the interactions
are not bilateral (Luke Guddat and Allen
Edmundson, unpublished results). Instead,
cross-molecule hydrogen bonding occurs only
between CL domains of the Mcg subunits.
Therefore, the principal packing motif is a
dimer of the hybrid and not a continuous ribbon.
These relationships are illustrated in Fig.
2. A similar pattern is observed in crystals of
the Fab New, where the six-stranded β-sheet
spans antiparallel CH1 domains [3].

4. Cross-molecule β-sheet produced by κ-type VL
domains

Currently, there is only one known example
of V domains engaging in this type of packing.
In crystals of an Fv (VL-VH combination)
from a human (Pot) IgM cryoglobulin, each κ-type
VL domain interacts with an antiparallel
VL domain across a 2-fold axis [4]. Unlike the
CL and CH1 packing patterns, the N-terminal
β-strand 4-1 of VL is the principal interacting
unit in the Pot Fv. As shown in Fig. 3, the
key packing step is again the formation of a
dimer rather than a ribbon. This type of packing
leads to a crystal with a high fractional
volume of protein in the unit cell (very advantageous
in X-ray analysis). For instance, the
protein:solvent ratio in the Pot Fv crystals was
0.64:0.36, compared with 0.50:0.50 in the B7-15A2
Fab and 0.40:0.60 in the Mcg L chain
dimer.

5. Structural features favoring β-sheet formation

If Fv fragments are excluded, bilateral participation
in cross-molecule β-sheet structures seems
to be primarily dependent on the relative orientations
of CL and CH1. Among human Bence-Jones
proteins and human Fabs, the components most
likely to manifest such favorable properties are CL
domains from λ-type light chains, alone or in
combination with γ1 heavy chains. In these domains
the 3-3 β-strands are often held in accessible
and ordered positions by quite regular
hydrogen bonding with 3-2 β-strands along their
inner surfaces. On their outer surfaces, the backbone
oxygen and nitrogen atoms not involved in
such intradomain interactions are free to form
hydrogen bonds with the corresponding β-strands
in symmetry-related molecules.

The portions of the 3-3 β-strands conducive
to docking are comparable in human λ and
γ1 chains. They involve the first six residues
following the turn between β-strands 3-2
and 3-3 and have the sequences Gly-Ser-Thr-Val-Glu-Lys
in the λ-type L chains (residues
203-208 in computer numbering; no allowances
for gaps or insertions) and Ser-Asn-Thr-Lys-Val-Asp
(residues 207-212) in the γ1 H chains.
There are four intermolecular hydrogen bonds
formed between the backbone atoms of antiparallel
3-3 β-strands in the Mcg dimer (both
L chains), the Mcg × Hud hybrid (Mcg subunits
only), the B7-15A2 Fab (L and H chains)
and the Fab New (H chains only). The pattern in
Mcg is simple to understand and will be considered
first.
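
Because the sections that follow refer to individual residues by number (for example Thr L205 or Lys H210), it can help to index the two hexapeptides quoted above by position. The Python sketch below does this using only the sequences and residue ranges given in the text. The helper name is hypothetical.

```python
from typing import Dict, List


def number_residues(sequence: List[str], first_position: int) -> Dict[int, str]:
    """Assign consecutive residue numbers to a peptide segment."""
    return {first_position + offset: residue for offset, residue in enumerate(sequence)}


# Segments of the 3-3 beta-strands as quoted in the text.
lambda_strand = number_residues(["Gly", "Ser", "Thr", "Val", "Glu", "Lys"], 203)
gamma1_strand = number_residues(["Ser", "Asn", "Thr", "Lys", "Val", "Asp"], 207)

print(lambda_strand[205], lambda_strand[206], lambda_strand[207])
print(gamma1_strand[208], gamma1_strand[210], gamma1_strand[212])
```

The lookups agree with the residue names used in the following sections, such as Val 206 at the center of the Mcg interactions and Asp H212 in B7-15A2.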



6. Cross-molecule hydrogen bonding in crystals of
the Mcg dimer

Val 206 is located at the center of the interactions.
Hydrogen bonds 1 and 2 are formed between
the main chain amide group and carbonyl
oxygen of Thr L205 of monomer 1 and the carbonyl
oxygen and amide group of Glu 207 of
monomer 2. In a reciprocal manner, the amide
and carbonyl groups of Thr L205 in monomer 2
interact with the carbonyl and amide groups of
Glu L207 in monomer 1 to form hydrogen bonds
3 and 4. There is a fifth hydrogen bond between
the side chain hydroxyl group of Ser 204 in
monomer 2 and the ε-amino group of Lys 208 of
monomer 1. Because of local asymmetry in the
conformations, only one such side chain hydrogen
bond is possible: the side chains of Ser 204 in monomer
1 and Lys 208 in monomer 2 are not sufficiently
close to interact.



7. Hydrogen bonding across 2-fold axes in the
Mcg × Hud hybrid

It is interesting that only the Mcg monomer
participates in the hydrogen bonding of the
Mcg × Hud hybrid, despite the fact that the
amino acid sequences of the two λ-chains are
identical in the 3-3 β-strands and surrounding
regions of the CL domains. Again, the pattern of
four hydrogen bonds in the L205-207 central region
is repeated, as shown in Fig. 2.



8. Intermolecular hydrogen bonding between CL
and CH1 in B7-15A2

There is a slight variation in the B7-15A2 Fab
(see Fig. 1, top panel). At one end of β-strand
3-3, the oxygen atom of Gly L203 makes a
hydrogen bond with the amide group of Asp
H212 (position analogous to position L207). In
the central region, the amide nitrogen and carbonyl
oxygen of Thr L205 form hydrogen bonds
with the oxygen and nitrogen atoms of Lys H210
(position analogous to L205). At the distal end
the amide group of Glu L207 hydrogen bonds
with Asn H208 (analogous to residue L203).
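
When two β-strands pair in an antiparallel fashion, the residue numbers of hydrogen-bonded partners move in opposite directions, so their sum stays constant along the pairing. The Python sketch below applies that general check to the residue pairs named in Sections 6 and 8. It is offered as an illustrative consistency check under that assumption, not as an analysis performed by the authors, and the function name is hypothetical.

```python
from typing import Iterable, Set, Tuple


def register_sums(pairs: Iterable[Tuple[int, int]]) -> Set[int]:
    """Return the set of (i + j) sums over hydrogen-bonded residue pairs.

    A single value means the pairs share one antiparallel register.
    """
    return {i + j for i, j in pairs}


# Mcg dimer: Thr L205 of one monomer pairs with Glu L207 of the other,
# in both directions (backbone hydrogen bonds 1-4 in the text).
mcg_pairs = [(205, 207), (207, 205)]

# B7-15A2 Fab: (L chain residue, H chain residue) pairs named in the text.
b7_pairs = [(203, 212), (205, 210), (207, 208)]

mcg_register = register_sums(mcg_pairs)
b7_register = register_sums(b7_pairs)

print("Mcg dimer register:", mcg_register)
print("B7-15A2 register:", b7_register)
```

Each set of pairs collapses to a single sum, as expected for one antiparallel register. The Mcg side chain contact between Ser 204 and Lys 208 gives the same sum as the Mcg backbone pairs.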

9. Hydrogen bonding between antiparallel CH1 domains
in Fab New

In Fab New [3] there are four hydrogen bonds
involving Lys H210 and Asp H212 of two CH1
domains (not shown). It is significant that these
residues in the 3-3 β-strands of both New and
B7-15A2 H chains are separated from the interchain
(L-H) disulfide bond by the same distance
as the comparable residues in the λ-type L chains
of B7-15A2, New and Mcg.

10. Hydrogen bonding between symmetry-related
κ-type VL domains

As in strand 3-3 of CL (see later section),
strand 4-1 of the Pot VL becomes ordered and
relatively straight after a disruptive proline
(residue 8). This ordered region consists of
residues 9-13 (Ser-Ser-Leu-Ser-Ala), which
are held in position on their inner surfaces by five
hydrogen bonds with a parallel β-strand (5-5):
Ser 9, Leu 11 and Ala 13 interact with residues
Lys 103, Asp 105 and Lys 107. Along the outer
surfaces of this strand, at least five carbonyl and
amide backbone groups are appropriately positioned
for intermolecular, antiparallel β-sheet
formation with segment L9-13 in an adjacent Fv.
In the center there are two hydrogen bonds (3.2
Å) between the backbone oxygen atoms of Ser
L10 of Fv molecules 1 and 2 and the amide
groups of Ser L12 of Fv molecules 2 and 1.
However, the other two combinations of atoms
are too far apart (3.6 Å) to complete the expected
set of four hydrogen bonds. A stereo diagram
illustrating these relationships is presented
in Fig. 3.
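
The distinction drawn above between the 3.2 Å and 3.6 Å contacts is a simple distance cutoff. The Python sketch below counts how many donor-acceptor separations satisfy such a cutoff, using the four distances quoted in the text. The function name is hypothetical. The cutoff is treated as inclusive, which is an assumption made here so that the 3.2 Å contacts described as hydrogen bonds in the text qualify.

```python
from typing import Iterable


def count_hydrogen_bonds(distances_angstrom: Iterable[float], cutoff: float = 3.2) -> int:
    """Count donor-acceptor separations that fall within the distance cutoff."""
    # Assumption: the cutoff is inclusive, so a 3.2 A contact counts as a bond.
    return sum(1 for distance in distances_angstrom if distance <= cutoff)


# Separations quoted for the Pot Fv dimer: two central contacts at 3.2 A
# and two outer contacts at 3.6 A.
pot_fv_distances = [3.2, 3.2, 3.6, 3.6]

n_bonds = count_hydrogen_bonds(pot_fv_distances)
print(f"{n_bonds} of {len(pot_fv_distances)} candidate contacts qualify as hydrogen bonds")
```

With these inputs only the two central contacts qualify, matching the text's statement that the expected set of four hydrogen bonds is not completed.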

11. Programming crystallization into the design of 
some Fabs 

It may be possible to design human Fabs with 
features predisposing them to crystallize in accord 
with the packing principles discussed above. Five 



Fig. 1. Top panel: Stereo diagram of three human B7-15A2 Fabs taken from a ribbon of molecules in the crystal lattice. The models
were constructed and displayed with the program molmol [10]. V domains are facing upward and the C domains downward in the
outer two molecules. In the middle Fab the C domains are at the top. Polypeptide backbone is represented by tubular segments,
yellow for the λ-type light (L) chain and steel blue for the γ1-type heavy (H) chain. Strands of β-pleated sheets are shown as
directional arrows superimposed on the polypeptide backbone. In the L chain, the 5-stranded layer in V L and the 3-stranded layer
in C L are colored magenta; the four-chain layers in V L and C L are in cyan. In the H chain, the 5- and 3-stranded layers are white
and the 4-stranded layers are red. Hydrogen bonds stabilizing the packing of antiparallel molecules are represented by four purple
rods between β-strands 3-3 of C H 1 and C L (white to magenta on the right; magenta to white on the left). This intermolecular
pattern of hydrogen bonds is repeated indefinitely along the ribbon in the crystal. Lower panel: Stereo diagram of three human
(Meg) λ-type Bence-Jones (L chain) dimers, excerpted as above from a ribbon of molecules in the crystal lattice. The monomers
have the same amino acid sequence, but one behaves structurally as a heavy chain analog [11]. General orientations and color codes
are the same as those in the upper panel. Intermolecular interactions include four hydrogen bonds on each side of the dimer, but
in a slightly different distribution from that seen in B7-15A2 (see text and Table 1).

Fig. 2. Stereo diagram of a dimer of the Meg x Hud hybrid in the crystal lattice [12-14]. Meg plays the structural role of the L chain
and Hud is the H chain analog (both are λ chains). Note that the four intermolecular hydrogen bonds involve only the C L domains
of the Meg component (purple rods crossing the crystallographic 2-fold axis between the two magenta 3-3 β-strands).

Fig. 3. Stereo diagram of a dimer of a human (Pot) Fv from an IgM-κ cryoglobulin. The κ-type V L domain is represented by a
yellow tubular model, with a cyan 4-chain layer and a magenta 5-chain layer. V H is steel blue, with red (4-stranded) and white
(5-chain) β-pleated sheets. The intermolecular hydrogen bonding interactions across a 2-fold axis involve the L chains only. Note
that only the middle two sets of backbone atoms in the 4-1 β-strands are sufficiently close to fit the distance criteria (separation
< 3.2 Å apart) for hydrogen bonds. The outer two sets of atoms, which are in the correct orientations for hydrogen bonds but 3.6
Å apart, are shown as small purple knobs.

Fig. 4. Schematic representations of the last two strands of β-pleated sheets and their adjoining segments in the C domains of λ,
γ1 and κ chains. β-strand 3-2 is on the right and β-strand 3-3 is on the left of each drawing. Hydrogen bonds stabilizing the
pleated sheets along their internal surfaces are depicted as ladders of purple rods between strands 3-2 and 3-3. Triads of amino
acid residues involved in packing interactions of λ and γ1 chains are listed next to their contributions to the external hydrogen
bonds. In κ chains, a proline residue in a sequence position corresponding to threonine in λ chains and lysine in γ1 chains interrupts
the β-structure of strand 3-3. This disturbance leads to enlargement of the loop between strands 3-2 and 3-3 and a diminution
or loss of the capacity to form β-pleated sheets with symmetry-related molecules. In λ chains, the connection between strands 3-2
and 3-3 is a narrow and regular β-turn. Two additional residues in the sequence of C H 1 in γ1 chains (Table 1) are accommodated
by formation of a single turn of 3₁₀ helix at the crown of the loop. Thus the connection between strands 3-2 and 3-3 is kept
relatively compact in γ1 chains. In contrast to κ chains, the λ and γ1 turns do not interfere with the propensity of the ensuing
β-strand to pack with its neighbors in the crystal lattice.
human immunoglobulin fragments crystallizing
primarily by antiparallel stacking of C domains
(Meg, Loc and Meg x Hud L chain dimers; New
and B7-15A2 Fabs) all have λ-type L chains,
alone or in combination with γ-type H chains of
subclass 1. It therefore seems prudent to emphasize
this combination of isotypes in the initial
attempts to program crystallization propensity.

A partial list of structural features making the λ
and γ1 isotypes so favorable for crystallization
can be assembled from the previous discussions.
(1) The 3-3 β-strands in the C domains are held
in appropriate orientations by quite regular intradomain
hydrogen bonding. (2) Immediately following
the turn between the 3-2 and 3-3 strands,
the carbonyl oxygen atoms of the first residue
(i) in 3-3 (Gly 203 in C L and Asn 208 in C H 1)
point outward for interactions with a docking
molecule. (3) Backbone amide and carbonyl
groups of residues i + 2 and i + 4 are also readily
available for intermolecular hydrogen bonding.
Among the known examples, they are the most
frequently used entities for such purposes (e.g.
Thr 205 and Glu 207 in C L; Lys 210 and Asp 212
in C H 1). (4) Although there are small variations
among the participants, four intermolecular hydrogen
bonds between backbone atoms have been
recorded in all five proteins of the λ and/or γ1
isotypes. Interactions between side chains may
add one or two hydrogen bonds to this count.
(5) Side chains of the 3-3 β-strands do not interfere
with the docking of the macromolecules.
(6) The sets of i, i + 2 and i + 4 residues in the L
and H chains are equidistant from the L-H interchain
disulfide bond [9], which anchors the 3-3
β-strands and essentially terminates the Fab. This
arrangement places the L and H β-strands in
register for typical antiparallel hydrogen bonding.
(7) Turns connecting the 3-2 and 3-3 β-strands
are quite regular in the two isotypes and consequently
do not significantly distort the 3-3 docking
surfaces.

12. Exceptions to the rules: C L domains of κ-type
L chains

In two murine Fabs with κ chains (4-4-20 anti-fluorescyl
and BV04-01 anti-ss-DNA antibodies),
as well as the human Pot Fv, we found that the
L chains dominated the crystal packing but not
by the mechanisms emphasized here [4-8]. Fragments
with κ-type L chains are still missing from
our preliminary list of proteins predisposed to
crystallize by C domain stacking. A partial explanation
may lie in the structure of the turn between
β-strands 3-2 and 3-3 of a κ-type C L domain in
a humanized (AF2) Fab (Z.-C. Fan and A.B.
Edmundson, unpublished results). This turn is
distinctly different from those of λ and γ1 chains
of B7-15A2, as illustrated in Fig. 4. The more
regular turns of λ and γ1 chains will be discussed
first.



13. Structures of turns between 3-2 and 3-3 in 
the B7-15A2 Fab 

Amino acid sequences of both the human κ and
γ1 chains have two more residues than λ chains
between the intrachain and interchain disulfide
bonds [9], as shown in Table 1. A gap is introduced
into the λ chain sequence to indicate the
locations of these extra residues, which are probably
the ones that expand the β-turn in the λ chain
to a distorted helix in the γ1 chain.

On the amino side of the turn (strand 3-2), all
three polypeptide chains are very similar (see Fig.
4). In the λ chains of both B7-15A2 and Meg, the
connecting segment between 3-2 and 3-3 is a
compact type II β-turn of four residues
(His-Glu-Gly-Ser, nos. 201-204). This turn is kept
narrow and well ordered by two hydrogen bonds
between His 201 and Ser 204 (i, i + 3) at its base.



Table 1

COOH-terminal amino acid sequences of human λ, γ1 and κ chains

Chain       Sequence
λ chain     CQVTHE  GSTVEKTVAPTECS
γ1 chain    CNVNHKPSNTKVDKRVEPKSC
κ chain     CEVTHQGLSSPVTKSFNRGE C

The abbreviations are presented in the one-letter code. Symbols
in boldface type represent residues involved in packing
interactions in λ and γ1 chains, and their counterparts in κ
chains.
Moreover, this arrangement insures a stable platform
for the carbonyl oxygen atom of Gly 203 to
participate in external hydrogen bonding with
another L or H chain.

In the B7-15A2 H chain, the turn is more
irregular and slightly wider than its counterpart in
the λ chain. For instance, the loop is stabilized at
its base by a single hydrogen bond between His
210 and Thr 215 (i, i + 5), which demarcate where
β-strand 3-2 ends and 3-3 begins. The crown of
the loop (Lys-Pro-Ser-Asn, nos. 211-214) flares
outward between these two positions to form a
single turn of a distorted 3₁₀ helical structure, with
hydrogen bonding between Lys 211 and Asn 214
(see crest of the loop in Fig. 4). At the end of the
loop, the backbone carbonyl group of Asn 214 is
in an orientation suitable for intermolecular hydrogen
bonding. Structurally, Asn 214 plays a role
like that of Gly 203 of the λ chain.



14. Structure of the κ chain loop between
β-strands 3-2 and 3-3

Two more residues are taken from the generic
strand 3-3 and added to produce the κ chain
loop, which then has a combined sequence of
His-Gln-Gly-Leu-Ser-Ser-Pro-Val (residues
198-205). Thus the λ, γ1 and κ loops progress in
length from four to six to eight residues. The
conformation of β-strand 3-2 of the κ chain is
similar to that of the γ1 chain through the His
residue. After the turn, the κ polypeptide chain
comes back into register with its γ1 counterpart at
Val 205. From Gln 199 to Pro 204 the segments
are very different, as is evident from Fig. 4.
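
The progression in loop length can be checked directly from the residue numbers quoted in the text. In the minimal sketch below, the residue ranges are taken from the preceding sections; reading the γ1 loop as the inclusive span from His 210 to Thr 215 is an interpretation of the (i, i + 5) pair, and the names `LOOPS` and `loop_length` are illustrative only.

```python
# Residue ranges (first, last) of the loops connecting strands 3-2 and 3-3,
# using the numbering quoted in the text for each chain.
LOOPS = {
    "lambda": (201, 204),  # His-Glu-Gly-Ser
    "gamma1": (210, 215),  # His 210 through Thr 215 (assumed inclusive span)
    "kappa": (198, 205),   # His-Gln-Gly-Leu-Ser-Ser-Pro-Val
}


def loop_length(first, last):
    """Number of residues in an inclusive residue range."""
    return last - first + 1


lengths = {chain: loop_length(*span) for chain, span in LOOPS.items()}
print(lengths)
```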

Note that the loop is considerably broader and
more irregular than those of the λ and γ1 chains.
Moreover, residues predicted to be potential contact
sites for other L or H chains in the crystal
lattice (e.g. Gly 200, Ser 202 and Pro 204) are
constituents of the looped out structure, rather
than the relatively straight part of β-strand 3-3.

It is now clear that the ordered (β-strand)
portion of 3-3 does not begin in a κ chain until
Val 205, which corresponds structurally to the
center of the crystal packing surfaces of the λ and
γ1 chains. The residue most responsible for this
anomaly is Pro 204, which not only alters the
direction of the chain but also disrupts the hydrogen
bonding propensity (only the carbonyl oxygen
of proline can participate).

By sequence criteria, Ser 202, Pro 204 and Thr
206 in a κ chain are equivalent to λ constituents
Gly 203, Thr 205 and Glu 207, the three most
commonly used residues for stacking interactions
in crystals. Because of the complementarity requirements,
we suspect that the surface on the
carboxyl side of Pro 204 would not be sufficiently
large or topographically adequate to bind a C L or
C H 1 domain in a symmetry-related Fab.
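
The sequence equivalence described above can be illustrated with a small lookup over the Table 1 sequences. This is only a sketch: the sequences are copied from Table 1 with the alignment gap and spacing removed, while the residue number assigned to the first Cys of each chain (197 for the lambda chain, 194 for the kappa chain) is an assumption inferred from the residue numbers quoted in the text, and the helper names are hypothetical.

```python
# Sequences from Table 1 (gap and spacing removed). The number assigned to the
# first Cys of each chain is an assumption inferred from the residue numbers
# quoted in the text, not a value stated in the table.
CHAINS = {
    "lambda": ("CQVTHEGSTVEKTVAPTECS", 197),
    "kappa": ("CEVTHQGLSSPVTKSFNRGEC", 194),
}


def residue(chain, number):
    """One-letter code of the residue at a given sequence number."""
    seq, first = CHAINS[chain]
    return seq[number - first]


def triad(chain, i):
    """Residues at positions i, i + 2 and i + 4, paired with their numbers."""
    return [(residue(chain, n), n) for n in (i, i + 2, i + 4)]


print(triad("lambda", 203))  # Gly, Thr, Glu
print(triad("kappa", 202))   # Ser, Pro, Thr
```

Under these numbering assumptions the lookup returns the same residues named in the text, with proline occupying the middle position of the kappa triad.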



15. Usage of amino acid triads in C L -C L or 
C L -C H 1 packing 

β-strand 3-3 begins with the carbonyl group of
glycine in λ chains (at the end of the gap in the
sequences given in Table 1) and with asparagine
in a comparable position in γ1 chains. These are
the first of three residues (boldface type in Table
1) involved in cross-molecule hydrogen bonding
in human λ and γ1 chains. Corresponding
residues in the κ chain are also in boldface. As
emphasized earlier, the first two are constituents
of the turn between strands 3-2 and 3-3.

Note that the residues in the packing triads of λ
and γ1 chains are in register relative to the interchain
disulfide bond and are therefore easy to
stack in an antiparallel fashion. The first residue
in the triad only supplies its carbonyl group to the
intermolecular interactions but the backbone
amide and carbonyl groups of the other two
members are fully accessible.

When inserted into the middle of what would
otherwise be a packing triad in C L of a κ chain,
proline disrupts the type of ordered local structure
seen in the λ and γ1 chains. The relatively straight
stretch of the 3-3 β-strand is thereby converted
into part of a wide turn less suitable for antiparallel
docking of another C domain. A proline
residue also interrupts the order of β-strand 4-1
in the Pot V L domain. Because of these disruptive
effects, as well as changes resulting from the
parallel pattern of internal hydrogen bonding, the
packing triad for the Pot Fv is Ser 9, Ser 10 and
Ser 12, rather than the more regular i, i + 2,
i + 4 arrangement of residues in C L and C H 1.



16. Conclusion 

For human antibodies with λ and γ1 chains, it
seems straightforward to program a crystallization
tendency into the design of synthetic Fabs.
We can follow the leads provided by X-ray
analyses of Fabs from naturally occurring antibodies.
In these molecules, the 3-3 β-strands
and the turns between strands 3-2 and 3-3 in
both C L and C H 1 are highly conserved in length
and sequence.

From a three-dimensional viewpoint, crystal
packing of C L and C H 1 seems to be more dependent
on the presence of a highly ordered 3-3
β-strand than on the actual sequence of side
chains. For example, the packing triads consist
of Gly, Thr and Glu residues in the λ chains
and Asn, Lys and Asp residues in the γ1 chains.
As a generalization, the side chains do not interfere
with the packing interactions and may even
enhance them (e.g. Meg L chain). However, a
single substitution (proline) in the triad region of
a κ chain can disrupt the structural order of the
3-3 β-strand and render it unsuitable for stacking
with another C L or C H 1. It would be interesting
to see if the structure of the 3-3 strand
could be significantly altered by replacement of
the proline.

In Fabs with γ1 chains, the last six residues
(including the interchain Cys) of the H chain are
encoded by the 'hinge' region gene. Different
subclasses of human and murine IgG antibodies
(as well as IgM and IgA immunoglobulins) have
distinct hinge regions with respect to disulfide
patterns, sequences and length. During the early
tests of the ideas presented here, it is therefore
desirable to program γ1-type hinge regions into
synthetic Fabs. This decision will help insure
that the packing triads of C L and C H 1 are kept
in register relative to the interchain disulfide
bond.

Fabs with κ-type L chains are not expected to
crystallize as ribbons but there is no a priori
reason why their H chain (γ1) components
should not pack as dimers. For Fvs with κ
chains (e.g. Pot), dimerization through the formation
of cross-molecule β-sheet structures can
produce high quality crystals with favorable
protein to solvent ratios.

In the Pot Fv, side chains are involved in the
cross-molecule interactions and the sequence in
this part of framework region 1 (FR1) is therefore
of considerable importance. Of the three
subgroups of human κ chains [9], only subgroup
II has an unfavorable sequence (a proline in position
12) for the Pot type of packing. Five side
chains of strand 4-1 in Pot are in contact with
a symmetry-related L chain and it seems remarkable
that they all dovetail properly.

This information should be of substantial
value in the synthesis and crystallization of synthetic
Fvs and single chain antibodies with κ-type
L chains. In our limited data base, the κ
chains are also prominent in controlling the
packing of larger antibody fragments, although
not as yet in predictable ways. Both λ and γ1
chains have naturally occurring features that
need little embellishment to make them self-assemble
into highly ordered patterns. These patterns
give great delight to contemporary
crystallographers and molecular biologists who
are trying to extend the new field of structure-based
design to the production of synthetic antibodies.
To facilitate understanding of, and access to, the information available for
protein structures, we have constructed the Structural Classification of
Proteins (scop) database. This database provides a detailed and comprehensive
description of the structural and evolutionary relationships of
the proteins of known structure. It also provides for each entry links to
co-ordinates, images of the structure, interactive viewers, sequence data and
literature references. Two search facilities are available. The homology search
permits users to enter a sequence and obtain a list of any structures to which
it has significant levels of sequence similarity. The key word search finds, for
a word entered by the user, matches from both the text of the scop database
and the headers of Brookhaven Protein Databank structure files. The
database is freely accessible on World Wide Web (WWW) with an entry point
to URL http://scop.mrc-lmb.cam.ac.uk/scop/

scop: an old English poet or minstrel (Oxford English Dictionary);
скоп: pile, accumulation (Russian Dictionary).
Keywords: protein families; superfamilies; folds; evolutionary 
relationships 



Nearly all proteins have structural similarities 
with other proteins and, in many cases, share a 
common evolutionary origin. The knowledge of 
these relationships makes important contributions to 
molecular biology and to other related areas of 
science. It is central to our understanding of the 
structure and evolution of proteins. It will play an 
important role in the interpretation of the sequences 
produced by the genome projects and, therefore, in 
understanding the evolution of development. 

The recent exponential growth in the number of
proteins whose structures have been determined by
X-ray crystallography and NMR spectroscopy
means that there is now a large and rapidly growing
corpus of information available. At present (January
1995) the Brookhaven Protein Databank (PDB;
Abola et al., 1987) contains 3091 entries and the
number is increasing by about 100 a month. To
facilitate the understanding of, and access to, this
information, we have constructed the Structural
Classification of Proteins (scop) database. This
database provides a detailed and comprehensive
description of the structural and evolutionary
relationships of proteins whose three-dimensional
structures have been determined. It includes all
proteins in the current version of the PDB and
almost all proteins for which structures have been
published but whose co-ordinates are not available
from the PDB.

Abbreviations used: PDB, Protein Databank; scop,
Structural Classification of Proteins.

The classification of protein structures in the 
database is based on evolutionary relationships and 
on the principles that govern their three-dimensional 
structure. Early work on protein structures showed 
that there are striking regularities in the ways in 
which secondary structures are assembled (Levitt 
& Chothia, 1976; Chothia et al., 1977) and in the 
topologies of the polypeptide chains (Richardson, 
1976, 1977; Sternberg & Thornton, 1976). These 
regularities arise from the intrinsic physical and 
chemical properties of proteins (Chothia, 1984; 
Finkelstein & Ptitsyn, 1987) and provide the basis for 
the classification of protein folds (Levitt & Chothia, 
1976; Richardson, 1981). This early work has been 
taken further in more recent papers; see, for example,
Holm & Sander (1993), Orengo et al. (1993),
Overington et al. (1993) and Yee & Dill (1993). An
extensive bibliography of papers on the classification
and the determinants of protein folds is given in scop.

The method used to construct the protein
classification in scop is essentially the visual
inspection and comparison of structures, though
various automatic tools are used to make the task
manageable and help provide generality. Given the
Figure 1. In scop, the unit of classification is usually the
protein domain. Small proteins, and most of those of
medium size, have a single domain and are, therefore,
treated as a whole. The domains in large proteins are
usually classified individually. The protein entries in the
December 1994 release of the Brookhaven Protein Databank (PDB)
contain 3179 domains. Many of these become forms of the
same protein whose differences are not significant in terms
of the classification used here; for example, they have
different bound ligands or engineered mutations. To
distinguish between these and structures of the same
protein from different organisms, proteins listed within a
family are subclassified by species. Classification of the
3179 domains shows that they come from 498 families that
can be clustered into 366 superfamilies and 279 different
folds. In addition to these, scop contains entries for 195
proteins that do not have atomic co-ordinates available
from the PDB at present but for which descriptions of their
structures have been published.



current limitations of purely automatic procedures,
we believe this approach produces the most
accurate and useful results. The unit of classification
is usually the protein domain. Small
proteins, and most of those of medium size, have
a single domain and are, therefore, treated as a
whole. The domains in large proteins are usually
classified individually.

The classification is on hierarchical levels that
embody the evolutionary and structural relationships.

FAMILY. Proteins are clustered together into 
families on the basis of one of two criteria that imply 
their having a common evolutionary origin: first, all 
proteins that have residue identities of 30% and 
greater; second, proteins with lower sequence 



identities but whose functions and structures are 
very similar; for example, globins with sequence 
identities of 15%. 
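
The two family criteria can be sketched as a simple decision rule. The 30% threshold comes from the text; everything else here is an assumption for illustration: the sequences are taken to be already aligned to equal length with `-` for gaps, identity is computed over ungapped columns (other definitions of percent identity exist), and the second criterion, which in scop rests on expert judgement of function and structure, is reduced to a boolean flag.

```python
def percent_identity(a, b):
    """Percent identical residues over ungapped columns of two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("alignment has no ungapped columns")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def same_family(identity, similar_function_and_structure=False, threshold=30.0):
    """First criterion: identity of 30% or greater. Second criterion: lower
    identity but very similar function and structure (supplied as a flag)."""
    return identity >= threshold or similar_function_and_structure


identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")  # toy sequences
print(identity, same_family(identity))
print(same_family(15.0), same_family(15.0, similar_function_and_structure=True))
```

The last line mirrors the globin example: at 15% identity the first criterion fails, so membership depends entirely on the second.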

SUPERFAMILY. Families, whose proteins have
low sequence identities but whose structures and, in
many cases, functional features suggest that a
common evolutionary origin is probable, are placed
together in superfamilies; for example, actin, the
ATPase domain of the heat-shock protein and
hexokinase (Flaherty et al., 1991).

COMMON FOLD. Superfamilies and families are
defined as having a common fold if their proteins
have the same major secondary structures in the same
arrangement with the same topological connections.
In scop we give for each fold short descriptions of its 
main structural features. Different proteins with the 
same fold usually have peripheral elements of 
secondary structure and turn regions that differ in 
size and conformation and, in the more divergent 
cases, these differing regions may form half or more 
of each structure. For proteins placed together in the 
same fold category the structural similarities 
probably arise from the physics and chemistry of 
proteins favouring certain packing arrangements and 
chain topologies (see above). There may, however, 
be cases where a common evolutionary origin is 
obscured by the extent of the divergence in sequence, 
structure and function. In these cases, it is possible 
that the discovery of new structures, with folds 
between those of the previously known structures, 
will make clear their common evolutionary relation- 
ship. 

CLASS. For convenience of users, the different
folds have been grouped into classes. Most of the
folds are assigned to one of the five structural classes
on the basis of the secondary structures of which
they are composed: (1) all alpha (for proteins whose
structure is essentially formed by α-helices), (2) all
beta (for those whose structure is essentially formed
by β-sheets), (3) alpha and beta (for proteins with
α-helices and β-strands that are largely interspersed),
(4) alpha plus beta (for those in which
α-helices and β-strands are largely segregated) and
(5) multi-domain (for those with domains of different
fold and for which no homologues are known at
present). Note that we do not use Greek characters
in scop because they are not accessible to all world
wide web viewers. More unusual proteins, peptides
and the PDB entries for designed proteins,



Table 1

Facilities and databases to which SCOP has links

Link                           URL                                                              Reference
Co-ordinates                   PDB                            http://www.pdb.bnl.gov/           (Abola et al., 1987)
Static images                  SP3D                           http://expasy.hcuge.ch/           (Appel et al., 1994)
                                                              gopher://pdb.pdb.bnl.gov/
On-the-fly images              NIH molecular modelling group  http://www.nih.gov/www94/molrus   (FitzGerald, 1994)
Sequences and MEDLINE entries  NCBI Entrez                                                      (Benson et al., 1993)

The scop database contains links to a number of other facilities and databases in the world. Several interactive viewers
can be linked with scop using PDB co-ordinates. The location and nature of the links will vary as databases evolve and
relocate.
Figure 2. A typical scop session is shown on a unix workstation. A scop page of the Interleukin 8-like family is displayed
by the WWW browser program (NCSA Mosaic) (Schatz & Hardin, 1994). Navigating through the tree structure is accomplished
by selecting any underlined entry, by clicking on buttons (at the top of each page) and by keyword searching (at the bottom
of each page). The static image comparing two proteins in this family was downloaded by clicking on the icon indicated
and is displayed by image-viewer program xv. By clicking on one of the green icons, commands were sent to a molecular
viewer program (RasMol) written by Roger Sayle (Sayle, 1994), instructing it to automatically display the relevant PDB file
and colour the domain in question by secondary structure. Since sending large PDB files over the network can be slow,
this feature of scop can be configured to use local copies of PDB files if they are available. Equivalent WWW browsers,
image-display programs and molecular viewers are also available free for Windows-PC and Macintosh platforms.
theoretical models, nucleic acids and carbohydrates,
have been assigned to other classes.

The number of entries, families, superfamilies and
common folds in the current version of scop are
shown in Figure 1. The exact position of boundaries
between family, superfamily and fold are, to some
degree, subjective. However, because all proteins
that could conceivably belong to a family or
superfamily are clustered together in the encompassing
fold category, some users may wish to
concentrate on this part of the database.
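
The hierarchy described above (class, fold, superfamily, family, then proteins subclassified by species) lends itself to a simple record type. The sketch below is not the scop data model; the class `ScopEntry`, the function `shared_level` and all entry names are hypothetical placeholders chosen only to show how two entries can be compared level by level.

```python
from dataclasses import dataclass

# Levels from broadest to most specific, following the description in the text.
LEVELS = ("class", "fold", "superfamily", "family", "protein", "species")


@dataclass(frozen=True)
class ScopEntry:
    scop_class: str
    fold: str
    superfamily: str
    family: str
    protein: str
    species: str

    def path(self):
        """Classification path from class down to species."""
        return (self.scop_class, self.fold, self.superfamily,
                self.family, self.protein, self.species)


def shared_level(a, b):
    """Most specific level at which two entries are grouped together, or None."""
    deepest = None
    for level, x, y in zip(LEVELS, a.path(), b.path()):
        if x != y:
            break
        deepest = level
    return deepest


# Placeholder entries; none of these names are taken from the database.
e1 = ScopEntry("all beta", "fold A", "superfamily A", "family A", "protein 1", "human")
e2 = ScopEntry("all beta", "fold A", "superfamily A", "family A", "protein 2", "mouse")
e3 = ScopEntry("all alpha", "fold B", "superfamily B", "family B", "protein 3", "human")

print(shared_level(e1, e2))  # same family, different proteins
print(shared_level(e1, e3))  # nothing shared, not even the class
```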

In addition to the information on structural and 
evolutionary relationships, each entry (for which 
co-ordinates are available) has links to images of the 
structure, interactive molecular viewers, the atomic 
co-ordinates, sequence data and homologues and 
MEDLINE abstracts (see Table 1). 

Two search facilities are available in scop. The
homology search permits users to enter a sequence
and obtain a list of any structures to which it has
significant levels of sequence similarity. The key
word search finds, for a word entered by the user,
matches from both the text of the scop database and
the headers of Brookhaven Protein Databank
structure files.

To provide easy and broad access, we have made 
the scop database available as a set of tightly coupled 
hypertext pages on the world wide web (WWW). 
This allows it to be accessed by any machine on the 
internet (including Macintoshes, PCs and work- 
stations) using free WWW reader programs, such as 
Mosaic (Schatz & Hardin, 1994). Once such a 
program has been started, it is necessary only to 
"open" URL: 

http://scop.mrc-lmb.cam.ac.uk/scop/ 

to obtain the "home" page level of the database. 

In Figure 2 we show a typical page from the 
database. Each page has buttons to go back to the 
top-level home page, to send electronic mail to the 
authors, and to retrieve a detailed help page. 
Navigating through the tree structure is simple; 
selecting any entry retrieves the appropriate page. In 
addition, buttons make it possible to move within the 
hierarchy in other manners, such as "upwards" to 
obtain broader levels of classification. 

The scop database was originally created as a 
tool for understanding protein evolution through 
sequence-structure relationships and determining if 
new sequences and new structures are related to 
previously known protein structures. On a more 
general level, the highest levels of classification 
provide an overview of the diversity of protein 
structures now known and would be appropriate 
both for researchers and students. The specific lower 
levels should be helpful for comparing individual 
structures with their evolutionary and structurally 
related counterparts. In addition, we have also found 
that the search capabilities with easy access to data 
and images make scop a powerful general-purpose 
interface to the PDB. 

As new structures are released by PDB and
published, they will be entered in scop and revised
versions of the database will be made available on
WWW. Moreover, as our formal understanding of
relationships between structure, sequence, function
and evolution grows, it will be embodied in
additional facilities in the database.
Background: Peptostreptococcus magnus protein L
(PpL) is a multidomain, bacterial surface protein whose
presence correlates with virulence. It consists of up to



chains found on two thirds of mammalian antibodies.

Results: We refined the crystal structure of the complex
between a human antibody Fab fragment (2A2) and a
single PpL domain (61 residues) to 2.7 Å. The asymmetric
unit contains two Fab molecules sandwiching a single



gions of two light chains via independent i



residues contacted on V L are remote from the hypervariable
loops. One PpL-Vκ interface agrees with previous
biochemical data, while the second is novel. Site-



studies suggest that the two PpL binding sites have
markedly different affinities for V L. The PpL residues in
both interactions are well conserved among different
Peptostreptococcus magnus strains. The Fab contact
positions identified in the complex explain the high
specificity of PpL for antibodies with kappa rather than



Conclusions: The PpL-Fab complex shows the first in-
- in of a bacterial virulence factor with a Fab light


Peptostreptococcus magnus protein L (PpL: whole protein
L. PpL domain: individual domain of protein L) is a
cell wall-anchored protein able to interact with a large
repertoire of mammalian immunoglobulins (Ig) [1].
Staphylococcus aureus protein A (SpA) [2] and streptococcal
protein G (SpG) [3] also share this property, although
they recognize different Ig regions. PpL is present
at the surface of about 10% of Peptostreptococcus
magnus strains [4]. It is a 76-106 kDa protein containing
four or five highly homologous, consecutive extracellular
Ig binding domains (depending on the bacterial strain
from which it is isolated [5, 6]). The structure of PpL
domain B1 (76 amino acids) has been determined by
NMR spectroscopy [7]. The fold of this domain is similar
to that of the SpG Ig binding domains. It consists of a
β sheet composed of two pairs of anti-parallel β strands
and an α helix that lies on top of the sheet.

PpL interacts with Ig light chains [1], notably with the
kappa light-chain variable region (Vκ) from humans and
other mammals [8, 9]. When present on the bacterial
surface, PpL has been described as a virulence factor
of bacterial vaginosis in different clinical specimens [4].
It has also been shown that PpL induces histamine release
by basophils and mast cells, presumably by cross-linking
IgE bound to Fcε receptors [10, 11].

PpL and SpA single domains are targets for protein
engineering due to their ability to bind the variable regions
(Fv) of a large population of antibodies. This



and affinity by phage display and other in vitro evolutionary
mutagenesis techniques [12]. The structural characterization
of PpL and SpA binding properties is useful
for defining the spectrum of Fvs, which bind to these
domains, based on their sequences.

We report in this study the crystal structure of the
human antibody Fab 2A2 complexed through its V L region
to a PpL domain. The same Fab was the subject of
a previous crystallographic study describing its complex
with a Staphylococcus aureus protein A (SpA) domain,
which binds to the antibody V H region [13]. Here, we
compare the location of these two Ig binding domains
on either side of the Fv region of Fab 2A2 in relation to
the antigen-combining site. Unexpectedly, we find that
there are two PpL-Fab interfaces, such that one PpL
domain has two separate regions that can interact with
kappa light chains and is capable of binding two Fabs
simultaneously. The two interfaces involve similar sites
on the V L domains. One of the PpL-Fab interfaces conforms
to previous biochemical data, while the second
is novel. Site-directed mutagenesis and analytical-centrifugation
studies suggest that the two PpL combining
regions have markedly different affinities for V L. The high
specificity of PpL for kappa (the largest mammalian V L
gene family), rather than lambda chains, is discussed in
light of the structure of the complex. We analyze the
sequence diversity of the PpL domains at positions involved
in the interaction with V L regions and discuss
avidity effects reported between whole PpL and Ig. Furthermore,
by comparing the interactions of PpL and SpG
with V L and C H 1, respectively, we note that at a structural
level, the Fab binding modes of these two bacterial
proteins show a degree of convergence.



The First V L -PpL Interface 

The asymmetric unit has one PpL domain and two Fab
molecules. The PpL domain is in close contact with the
V L region from both Fabs (Figure 1a). This stoichiometry
was unexpected because no previous studies raised
any evidence for one PpL domain being able to complex
to two V L regions simultaneously [6, 14].

The first interface buries a total solvent-accessible
area of 1300 Å² with approximately equal contributions
from both molecules, as determined with a 1.4 Å radius
probe, and is remote from the Ig heavy chain. There is
no conformational change in the backbone of either
partner upon binding. The PpL domain C* used in this
study has 94% sequence identity with the C4 domain
[15]. Its affinity for kappa chains is comparable (130 nM)
to the one measured for PpL domain B1 [14, 15]. Domain
C* maintains the same three-dimensional structure as
PpL domain B1 determined by NMR [7], and the two
structures superpose with an rmsd of 1.31 Å over 59
residues. The complexed PpL domain C* superposes
with an rmsd of 0.39 Å over 61 residues of the crystallographic
structure of free PpL domain C (B.J.S., unpublished
data). Similarly, the variable region of the complexed
Fabs superimposes with an rmsd of 0.6 Å over
215 residues with the free structure determined previously
[13].
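
The rmsd values quoted above measure how closely equivalent atoms of two structures coincide once the structures have been superposed. The following is a minimal sketch of that calculation only. It assumes the two coordinate sets are already superposed and listed in the same residue order, it does not perform the fitting step, and the coordinates shown are invented for illustration rather than taken from the structures discussed here.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-superposed coordinate lists.

    Each list holds (x, y, z) tuples in Å for equivalent atoms, in the same order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the two equivalent atoms
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical C-alpha positions (Å) for a free and a complexed domain
free = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
bound = [(0.0, 0.3, 0.0), (3.8, 0.0, 0.4), (7.6, 0.0, 0.0)]

print(f"rmsd over {len(free)} atoms: {rmsd(free, bound):.2f} Å")
```

A small rmsd over many residues, as reported for the free and complexed forms, is what supports the statement that binding causes no backbone conformational change.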

The interaction with PpL involves 13 residues from
the Fab (Figures 1b and 1c). Ten are located in framework
region 1 (FR1). The others are Lys-L107 from the
segment connecting the V L to the C L region, Glu-L143
from the C region, and Arg-L24 from the CDR-L1 region
[16] on strand B. However, this residue does not belong
to the V L hypervariable loops according to the structural
definition of Chothia [17] or to the positions frequently
identified by the contact with antigens [18]. Compared
with SpA, which binds to the variable region of the same
human antibody (Fab 2A2) [13], PpL is farther away from
the center of the antigen binding site (23 Å compared
to 16 Å). Hence, like SpA, PpL binding should not affect
accessibility to the antigen-combining site, as also suggested
by competition assays [19].



Of the amino acids in the PpL domain, 12, located
mainly on strand β2 and the α helix, are involved in this
interaction (Figure 2). This interface is characterized by
six hydrogen bonds (Figure 1b, Table 1). Three are between
main-chain atoms located on PpL strand β2 and
on strand A from the first Fab; they thus join the β sheets
of the Fab and the protein L into a unique sheet through
a β zipper interaction.

Heteronuclear NMR spectroscopy has been used for
mapping backbone positions of the PpL domain B1 [20]
involved in the interaction with the V L region. Most positions
identified for domain B1 are also implicated in the
first interface of domain C* described in the present
study. These backbone positions are on strand β2 and
the α helix of the PpL domains. A minor discrepancy
with the NMR results concerns the loop between the α
helix and strand β3, which does not make any contact
with the Fab V L region in the crystal structure of the
complex. This loop is poorly defined in the NMR structure,
so the discrepancy can be attributed to a change
in mobility upon complexation.

The first interface was subjected to site-directed mutagenesis
of PpL domain C*. A 23-fold drop in affinity
results from a Y53F substitution [21] as measured by
competitive ELISA. This decrease following the loss of
the tyrosine hydroxyl is as expected on the basis of the
energy of a neutral hydrogen bond [22], consistent with
the loss of the interaction between the Tyr-53 hydroxyl
group and the Thr-L20 carbonyl group of the V L region
(Table 1). Thus, the first interface explains the existing



The Second V L -PpL Interface 
In the crystal, the single PpL domain is sandwiched
between the V L regions of two 2A2 Fabs present in the
asymmetric unit (Figure 1a). The second interaction buries
a total solvent-accessible area of 1400 Å², comparable
to other protein-protein complexes [23]. The PpL
domain C has a different orientation in the two interactions
relative to the Fab β sheet. This second interaction
involves 15 V L residues, located mainly on the β strands
A and B (as in the first interface) with some participation
of strands D and E (Figure 1c). Out of the 15 residues,
10 are common to the first one. On the contrary, none
of the PpL residues that contribute significantly to this
interface are involved in the first one. The 14 amino acids
from the PpL domain involved in this second interaction
come mainly from strand β3 and from the α helix (Figure
2). Six hydrogen bonds and two salt bridges mediate this
interaction (Table 2), as compared to only six hydrogen
bonds for the first interface (Table 1). Although unexpected
from biochemical studies, this second PpL-Fab
interface buries an area too large to be a crystal contact.
Given that this interface buries a surface larger than
1400 Å², the probability that it is just a crystal contact
can be evaluated as only 2% [24]. Thus, we believe that
this interface has to be given serious attention.

What is the strength of this second interaction? At
present, no definite answer can be given, but we have
used mutagenesis to probe the relative contributions of
these two interfaces. First, we constructed the Y64W
mutant of a PpL domain C* to have an efficient fluorescence
probe so as to measure Kd by the stopped-flow
method [21]. Second, we constructed two double mutants,
Y53F-Y64W [21] and D55A-Y64W (this study), with
the purpose of weakening, respectively, the first and
second interfaces. In the Y53F-Y64W mutant, the replacement
of Tyr-53 by Phe disrupts the hydrogen bond
between the tyrosine hydroxyl and the L20 carbonyl in
the first interface. Similarly, we chose the D55A-Y64W
mutant in order to disrupt the salt bridge between Asp-55
and Arg-L24 as well as the hydrogen bond with the
Ser-L7 side chain in the second interface. These three
single PpL domain constructs are correctly folded, as
indicated by far-UV CD spectroscopy (data not shown).
While the loss of a single hydrogen bond at the first
interface has a significant effect on the dissociation constant
of the V L -PpL domain C* complex, the disruption
of the salt bridge and the hydrogen bond mediated by
Asp-55 of PpL at the second interface does not alter
the Kd of the complex (Table 3). This suggests that
the binding of single PpL domain C* is dominated by the



domain of two Fab-combining regions, differing in affinities
by about two orders of magnitude, has been obtained
recently by analytical-centrifugation studies
(B.J.S. and R. Beavil, unpublished data).
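
Ratios of dissociation constants, such as the 23-fold drop reported for the Y53F substitution or the roughly 100-fold difference between the two combining regions, can be converted into a binding free energy difference with the standard relation ΔΔG = RT ln(Kd,weak / Kd,strong). The sketch below illustrates this conversion. The temperature of 298.15 K is an assumption made for illustration, since the measurement temperature is not given in the text above, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ddg_from_fold_change(fold_change, temperature_k=298.15):
    """Free energy difference (kcal/mol) for a given ratio of Kd values.

    temperature_k is an assumed value; the source does not state it.
    """
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    return R_KCAL * temperature_k * math.log(fold_change)

# 23-fold loss of affinity on removing the Tyr-53 hydroxyl (Y53F)
print(f"23-fold:  {ddg_from_fold_change(23):.2f} kcal/mol")
# Two orders of magnitude between the two Fab-combining regions
print(f"100-fold: {ddg_from_fold_change(100):.2f} kcal/mol")
```

Under this assumption, a 23-fold change corresponds to somewhat under 2 kcal/mol, which is in the range usually quoted for a single neutral hydrogen bond.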

This observation is reminiscent of the 1:2 stoichiometry
described for the interaction between human growth
hormone and its receptor [25], where two different faces
of the hormone contact similar surfaces on two receptors,
with the result being receptor activation [26]. By
analogy, in our study a single PpL domain contacts
similar surfaces on two V L regions and could bridge two
Igs anchored at the membrane of B cells. The biological
relevance of this second interface and the potential mitogenic
activity of a single PpL domain on B cells are
under investigation.

V L Specificity of the PpL Binding Interaction
PpL has been reported to bind efficiently to V L regions
of about two thirds of the human Ig repertoire and to
thus encompass mainly the κ1, κ3, and κ4 subgroups
[8] but neither κ2 nor any λ subgroups. The PpL kappa



s pi]. To analyze the 
structural determinants of this specificity, we have superimposed
on the 2A2 V L region the equivalent structures
found on different kappa and lambda subgroups.
We observed that PpL binding ability is mostly concentrated
on the L5 to L12 segment of the V L region and
that the interaction is dependent on the main-chain conformation
of this segment, on which conformation the
β zipper interaction also depends. Particularly important
are residues L8-L12, which bury more than 80% of their
total accessible area. The V L residues of this segment
and at other positions of the first interface are well conserved
among the recognized V L regions (Table 4). The
main chain superimposes well onto V L structures of
the κ1, κ3, and κ4 subgroups (rmsd of 0.3-0.5 Å over the
eight residues from L5 to L12). Most κ2 subgroup sequences
have a proline residue at position L12 that introduces
major steric hindrance by pointing the Pro ring
toward the PpL domain. For the λ subgroup, two observations
account well for the absent or weak binding
activity found for these chains. Firstly, the L5-L12 seg-



interaction (rmsd of 1-1.2 Å over the seven residues
from L5 to L12). Secondly, due to the reduced size of
this segment, a cavity is created around positions L7
and L8. The Fab recognition mode developed by the
PpL domain is highly dependent on the backbone conformation,
different from that of SpA, which relies more
on a side chain-specific recognition mode [13].

Avidity of Whole PpL for Immunoglobulins
PpL contains four or five highly homologous, consecutive
extracellular Ig binding domains in tandem, depending
on the strain. At what level are the positions
involved in the first interface conserved between the
different domains? Sequence alignment of the four C
domains and five B domains with the C* domain (Figure
2) shows that domain C1, which differs at 20 positions
out of 61, is the most distant from domain C*. The differences
on C1 are distributed all over the domain, with
higher incidence on the α helix and strands β2, β3, and
β4. The other C domains have a higher degree of sequence
identity with C*. The B domains have from 10
to 17 well-distributed changes compared to C*.

Structurally, a core of seven critical residues can be
identified. These residues are those largely buried upon
complex formation or those involved in hydrogen bonds:
Gln-35 to Lys-40 from strand β2 and Tyr-53 from helix
α. These seven residues are strictly conserved in eight
out of the ten PpL domains (Figure 2). Domains C1 and
B5 each have only one nondisruptive difference. The
changes from Thr-36 to Asn in C1 and from Glu-38 to
Thr in B5 could weaken the interaction but not disrupt
it. Residue changes outside the structural core in domains
C2 and C3 could result in a slightly lower affinity
for κ chains compared to that of domain C*. Residues
49 and 52 from domain C* are, respectively, Glu and
Arg, which make an internal salt bridge in the middle of
the α helix. In domains C2 and C3, these residues are
replaced, respectively, by Lys and Ala. This disrupts the
salt bridge and places the Lys in close proximity to two
Arg residues, and this further results in an unfavorable
accumulation of positive charges.

As is the case for the five SpA domains with respect
to their interaction with V H region [13], most PpL domains
should also conserve all their hydrogen bonding interactions.
Thus, we would expect that all the PpL domains
would bind to the V L region in a similar way so that a
whole PpL would contain at least four Ig binding sites.
The equivalence in binding between the domains is consistent
with previous studies on the binding properties
of different PpL constructs. These studies show that
avidity increases with the number of Ig binding domains
[6]. In fact, a four-domain PpL has an avidity 100-fold
higher than the single domain, and a fifth Ig binding
domain does not improve avidity further [6]. By superimposing
a PpL-Fab complex on each V L region of whole
IgG [27, 28], we can show that the distance between
the C terminus of a PpL domain bound to one Fab and
the N terminus of the other bound PpL ranges from
60 to 120 Å depending on the Fab hinge angle. Four domains
(including their interdomain linkers of 16 amino





Figure 3. Comparison of SpG-C H 1 and PpL-V L Interactions
Orienting the SpG and PpL β sheets in an
analogous manner underscores the structural
similarities of both interactions. Strand β2 of
both domains interacts with, respectively, an
Fab V L (blue) or an Fab C H 1 (green)



acids each) could span this distance, and this could
explain the strong avidity effect observed on Ig binding.
The same pattern of modular domains binding to Igs is
shared by SpA and SpG, which like PpL seem to have
evolved high avidity for Ig by cassette duplication to



Structural Convergence between the Fab Binding 



two infective bacteria [28]. With only 15% sequence
identity, the Ig binding domains are the least homologous
regions between these cell surface proteins. Despite
the low sequence identity, those domains share a
common fold, with an α helix packed against a four-stranded
β sheet. The NMR and crystal structures of
both domains have shown that the main structural difference
is the orientation of the α helices [7, 30]. The helix
runs almost parallel to the β strand direction in the PpL
β sheet, whereas in SpG it runs diagonally across the
sheet. This arises from a difference in the loop between
the α helix and the strand β3, which is one residue
shorter in PpL. Since this is the region of SpG that binds
to Fc, it may in part explain the absence of Fc binding
by PpL.

The interaction of SpG with C H 1 and that of the first
interface of PpL with V L regions of Fab (Figure 3) share
similar features. Both domains bury equivalent surface
areas upon binding to the Fab and have two common
structural features. Firstly, the same region of these two


this strand extends a β sheet of the Fab through
a β zipper interaction. This interaction involves five
main-chain/main-chain hydrogen bonds in the SpG-C H 1
interface [30] but only three hydrogen bonds in the PpL-V L
first interface. The external strand A of the V L region
involved in the interaction presents a bulge due to proline
at position L8 in the middle of the strand, and this
bulge shortens the β zipper. Although the SpG and PpL
domains are clearly the result of divergent evolution [29],
they have maintained a common binding strategy for
interacting with the β strands from different Ig domains
(Figure 3) through a main-chain-to-main-chain hydrogen

repertoire of
none of those bacterial de-
recognition, they
may be used to aid the crystallization of antigen-Fab
[32, 33] or membrane protein-Fv complexes [34]. We
suggest that these small protein domains should be
used in a combinatorial manner to increase the variety
of possible crystal contacts that can be formed. This
would help to solve some problematic crystallization
cases. Knowledge of the structure of the PpL-Fab complex
will facilitate the engineering of PpL variants with


monomethyl polyethylene glycol (MPEG) 5000, 100 mM imidazole
malate [pH 8.5]) with an equal volume of protein solution (5 mg/ml)
containing different Fab 2A2:PpL molar ratios. The ratio was
optimized with the aid of streak seeding [37]. Interestingly, the com-




Biological Implications 

Several pathogenic bacteria present cell surface pro- 



tion of host tii 
interact with a wide proportion of the immunoglobulin
repertoire through interactions with constant regions of
immunoglobulin that are highly conserved in sequence.
Two bacterial proteins, Staphylococcus aureus protein
A (SpA) and Peptostreptococcus magnus protein L
(PpL), interact with a wide range of antibodies by targeting
variable, rather than constant, regions of Fab.

We recently reported the crystal structure of an SpA
domain in contact with the variable-heavy (V H) region of
an Fab [13] and we now report the crystal structure of
a PpL domain in complex with the variable-light (V L)
region of the same Fab. Both bacterial domains interact
with the framework part of these variable regions without
contacting the hypervariable loops. The positions identified
in both complexes account well for the wide repertoire
of immunoglobulins recognized by both bacterial
domains. In contrast with SpA, which targets conserved
residues of the V H external β sheet, PpL targets a portion
of the V L region that is not strictly conserved in sequence
but that maintains a constant backbone conformation
among most V L regions encoded by κ genes.

A single domain of this multidomain protein manifests
PpL activity. The PpL domain appears to have two independent
binding sites for V L, which interact with very
similar areas on the light chain but with markedly different
affinities. The residues involved in the V L interaction
are well conserved among the different PpL domains.
The interaction of the single domain accounts well for
two of the properties reported for the whole protein L.
Firstly, PpL exhibits an avidity effect for whole immunoglobulin,
and the overall position of a single bacterial
domain suggests that a whole PpL with up to five domains
in tandem should be able to bind the two V L
regions of an IgG. Secondly, PpL could induce histamine
release by bridging two IgE bound to their Fcε receptors
at the surface of human basophils or mast cells. The
conservation of PpL positions in all domains is also in
agreement with this last property.



Crystallization and Data Collection 
The 2A2 IgM rheumatoid factor was 
created from synovial B cells 



Is patient [35]. 
in cleavage of the IgM [35]. The V L 
subtype recognized by PpL The 



MPEG 5,000, 25% (v/v) ethylene glycol, 9% (v/v) xylitol, 100 mM

HEPES, pH 7.5). Data were rec

on DW32 beamline at LURE sy 

frig the HKL package [38]. The crystal belongs to 

P2₁2₁2₁ with a = 55.2 Å, b = 87.3 Å and c

Table 5. 




a single PpL 
l Both the NMR 

e of the 81 domain [7] and the crystal structure of the C 
domain of PpL (B.J.S, unpublished) were fitted globally In the elec- 
' '—- " " It for the crystal structure. The 2 




. he V H and V L regions of 
the Fabs, the PpL domain, and in particular the V L -PpL domain
interfaces are well defined in the electron density maps (Figure 1b).
Statistics of the structure determination are summarized in Table 5.

The Y64W PpL and the Y53F-Y64W PpL mutants were constructed
ribed previously [21]. The D55A-

DNA by the use of the antisense primer 5' TGC TAA TAA
5 ATA TOT 3'. The site of the mutation is shown in bold,
were confirmed by DNA sequencing with the Seque-

Introduction

Light chain-associated (AL) amyloidosis is 
characterized by the deposition in tissue of 
monoclonal light chain-related components - a 
pathologic process that leads to organ failure and eventual 
death. The lack of information on the etiology of this disease 
and the fact that there is no effective treatment to prevent 
or remove the abnormal tissue deposits represent major 
scientific challenges, the solutions of which may be relevant 
to other amyloid-associated disorders. 

The purpose of this review is to summarize current 
research efforts that are directed towards elucidation of the 
protein and host factors thought to be responsible for the 
induction and perpetuation of AL amyloidosis. The insights 
that have emanated from these studies and the availability 
of new technologies, especially in structural and molecular 
biology, form the basis of future research. The ultimate goal 
is to apply this information clinically through design of 
novel therapeutic strategies that will be used to overcome 
the devastating impact of this disease. 



Protein factors in the pathogenesis of AL amyloidosis

Over 20 years have passed since Osserman and 
colleagues documented the almost invariable presence of 
serum or urinary monoclonal immunoglobulins (Igs) in 
patients diagnosed with what was previously termed 
primary amyloidosis' 3 . The prediction that these molecules 
were involved in the pathogenesis of this disease 2 was 
confirmed when Glenner and co-workers (and subsequently, 
many other investigators) demonstrated that the amyloid 
deposits occurring in such patients were fibrillar in nature 
and composed of monoclonal light chains or, more 
commonly, light chain variable-region (V L ) fragments 3 . The 
unequivocal relationship between secreted and deposited 
light chains has since been established through comparative 
sequence analyses of Bence Jones proteins and amyloid 
proteins obtained from individual patients 3 - 4 . 

Historically, research on AL amyloidosis has been 
directed mainly towards characterizing the primary 
structural features of monoclonal light chain amyloid- 
associated proteins - i.e., those extracted from amyloid 
deposits or excreted in the form of Bence Jones protein, as
well as molecules encoded by DNA cloned from bone
marrow-derived monoclonal plasma-cell populations. The 
goal of these studies has been to determine if particular 
chemical features of the light polypeptide chain render it 
amyloidogenic. A voluminous body of sequence data has 
been generated that describes apparently novel amino acid 
substitutions in the V L of such light chains 5 . Additionally, 
in some cases, post-translational modification (e.g., 
glycosylation) has been thought to account for this 
phenomenon 6 . However, given the rarity of carbohydrate- 
containing light chains and the fact that these components 
do not invariably form amyloid deposits, the functional or 
pathophysiological effects of glycosylation or other post- 
translational modifications (deamidation, etc.) are not 
known. 

Light chain primary structure 

In an effort to differentiate amyloid proteins from those 
considered non-amyloidogenic, the primary structures of 
representative components have been compared. 
Unfortunately, these studies are confounded by the fact that 
while there is a relatively large referenced database on the 
former, the number of documented benign, "non-amyloid," 
or non-nephrotoxic light chains (Bence Jones proteins that 
are not associated with myeloma [cast] nephropathy and 
light chain deposition disease) is markedly limited. Further, 
those proteins designated "non-amyloidogenic" may, in fact, 
represent amyloid-associated light chains derived from
patients with unrecognized disease. 

Another factor that complicates comparative analyses
results from the variability inherent in light chain structure:

First, there are two types of light chains, κ and λ, each
having distinctive V L (and constant [C L]) domains 5-7. The
V L portion of the molecule is the product of two exons - V
and joining (J) - that encode, respectively, the first ~95 and
following ~13 residues. The V κ and V λ domains are
characterized by three hypervariable or complementarity-determining
regions (CDRs) and four framework regions
(FRs) that are involved, respectively, in the antigen-binding
site and in maintaining the structural integrity of the
molecule (Figure 1). Diversity in sequence results from the
presence of multiple, functional V κ- and V λ- (~35 each)
and J κ- and J λ- (5 and 4, respectively) germline genes, as
well as the recombinatorial process that links the V and J
segments (variation in the length of CDR3 at the V-J joint
can also result from the presence at position 96 of non-template-encoded,
extra residues that are the products of N
or P nucleotides) 8-13.
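
The germline gene counts given above imply a simple lower bound on the number of distinct V-J combinations for each light chain type. The sketch below works through that arithmetic. It uses only the approximate counts stated in the text, and it deliberately ignores junctional diversity (N or P nucleotides), allelic differences, and somatic mutation, all of which increase the real diversity; the variable names are illustrative.

```python
# Approximate functional germline gene counts, as stated in the text
GERMLINE = {
    "kappa": {"V": 35, "J": 5},
    "lambda": {"V": 35, "J": 4},
}

def vj_combinations(counts):
    """Number of distinct V-J pairings, ignoring junctional and somatic diversity."""
    return counts["V"] * counts["J"]

combos = {chain: vj_combinations(c) for chain, c in GERMLINE.items()}
for chain, n in combos.items():
    print(f"{chain}: {n} V-J combinations")  # kappa: 175, lambda: 140
```

Even this lower bound shows why a comparison of amyloid and non-amyloid sequences must account for the germline origin of each protein before attributing a residue difference to disease.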

Allelic differences can also account for diversity in
primary structure, but of greater importance is the
pronounced variability that is introduced in light chain-encoded
germline sequences by somatic mutation. As yet,
there has been no report of an amyloid-associated light chain
that is completely identical in sequence to that encoded by
a V κ- or V λ-germline gene; however, the contributory role
of somatic mutation to light chain amyloid formation
remains to be established.

FIGURE 1. Schematic representation of V L and C L domains and their genetically-encoded segments. The V L is the product of two gene
segments, V and J, that encode, respectively, the first ~95 residues and the remaining ~13 residues of the V region. The C L is the product
of a single gene, designated C, that encodes the remaining ~106 residues of the light chain. The locations of the three hypervariable
regions or CDRs and of the four FRs are as indicated; the J-encoded portion of the V L encompasses the terminal residue(s) of CDR3 and
all the FR4 residues. The wavy lines symbolize the location of additional amino acid residues found within the FRs and CDRs. The residue
numbering system is according to Reference 5.

Structural features related to amyloidogenicity

What are the structural features that account for the
amyloidogenicity of certain light chains? It is noteworthy
that in patients with AL amyloidosis λ-type monoclonal
Igs predominate, in contrast to the prevalence of κ-type
components in other plasma-cell dyscrasias such as multiple
myeloma, light chain deposition disease and Waldenstrom's
macroglobulinemia 14. Such overrepresentation of λ chains
may be attributed to the fact that these components typically
exist as covalent dimers, in contrast to κ molecules that
occur in the monomeric as well as dimeric form 15. This
difference in quaternary structure may reduce the renal
catabolism of λ-type components or, alternatively, increase
their propensity to form amyloid. Most remarkable has been
the finding that proteins of one particular V λ subgroup -
V λVI - are invariably associated with this process 16. Despite
the relative rarity of this gene family in the normal λ-chain
population (~5%), we have documented that 37% (13 of
35) of λ-type Igs obtained from patients with proven
amyloidosis were of this isotype 17. Among the primary
structural features that distinguish these proteins are the
presence of a unique two-residue insertion in FR3 that
includes an Asp residue at position 66, as well as a Lys
residue at position 17 in FR1. Although the tertiary structural
features of λVI light chains have not as yet been elucidated,
the interactive potential of these two oppositely charged
amino acids may so alter the conformation of these
molecules as to render them especially amyloidogenic (F.
J. Stevens, personal communication).
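
The overrepresentation described above (13 of 35 amyloid-associated proteins belonging to a subgroup that makes up about 5% of the normal population) can be put on a quantitative footing with a one-sided binomial tail probability. The sketch below is an illustration only. It assumes the 35 proteins are independent samples and that the ~5% background frequency is exact, neither of which is established in the text, and it is not a reanalysis of the cited study.

```python
from math import comb

def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

n_proteins = 35    # amyloid-associated proteins examined
n_subgroup = 13    # of those, members of the overrepresented subgroup
background = 0.05  # assumed frequency in the normal population (~5%)

observed_fraction = n_subgroup / n_proteins
fold_enrichment = observed_fraction / background
p_value = binomial_tail(n_proteins, n_subgroup, background)

print(f"observed fraction: {observed_fraction:.0%}")
print(f"fold enrichment over background: {fold_enrichment:.1f}")
print(f"P(at least {n_subgroup} of {n_proteins} by chance): {p_value:.1e}")
```

Under these assumptions only about two such proteins would be expected by chance, so the observed count represents a roughly seven-fold enrichment.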

To date, comparison of sequence data derived from
amyloid and "non-amyloid" proteins (both λ- and κ-type)
has failed to reveal a single, site-specific residue that
differentiates between these two light chain populations.
This finding (as well as the apparent irrelevance of
glycosylation) differs from the case of the
hemoglobinopathies where, in the case of hemoglobin S,
for example, the one-residue Glu/Val substitution in the β
chain so alters the molecule that when deoxygenated
sickling occurs 18.

In contrast to the theory that a single amino acid can
render a light chain amyloidogenic, Stevens et al. 19 proposed
an alternative mechanism to account for this phenomenon;
namely, this process is based on a series of interactions
between V L -associated amino acid residues, first between
those in monomers leading to the formation of dimers, then
between dimers to form pre-amyloid filaments and finally,
between filaments to form fibrils (Figure 2). Because
multiple positions within the V L participate in β-domain
interactions that result in V L dimer formation and ultimately
amyloid fibrils (e.g., salt bridges, hydrogen bonds, or van
der Waals contacts), almost any amino acid substitution
could theoretically lead to an increase or decrease in the
propensity of such interactions to occur, thereby modifying
the tertiary structural features of the light chain and causing
it, under appropriate physiological conditions, to become
amyloidogenic. Conversely, particular amino acid
substitutions could, in fact, inhibit light chain interactions
and thus render the protein "non-amyloidogenic." The
model satisfies several experimentally defined features of
AL amyloid, including fibril diameter and the presence of
crossed β sheets that are aligned with the fibril axis and
create an intrinsically self-stabilizing assembly.

This hypothesis was supported by the finding that
amyloid-associated light chains contained potentially
interactive, chemically distinct residues located at key
positions within the V L domain 20. A position-by-position
comparison of the frequency of particular residues in 180
human monoclonal κ- and λ-type proteins, 52 of which were
obtained from patients with known amyloidosis, revealed
statistically significant differences in the CDRs and FRs of
the two populations. For example, amyloid-associated
molecules were most often characterized by the presence
in the CDRs of sterically accessible, charged residues (e.g.,
Asp). Alternatively, in FR2, the typical basic residue found
at position 45 in non-amyloid proteins was replaced by Asn
and in FR1 at position 20, amyloid constituents had a
hydrophobic Ile residue instead of Thr or Ser. For both κ
and λ light chains, the majority of amyloid-associated
positions were distributed along the surface of the light chain
involved in the antigen-binding site and, therefore, would
not participate in the internal packing of the V L domain.
Further, the side chains of residues at such positions would
extend into the solvent and be accessible for interaction
with other molecules in solution.

These chemical differences that potentially modify light 
chain tertiary structure were also predicted to enhance V L - 
dimer formation and result in protein aggregation. Earlier 
studies, using size-exclusion chromatography, demonstrated 
that, on the basis of polymer formation, Bence Jones 
proteins obtained from patients with amyloidosis or other 
types of pathologic light chain deposits could be 
distinguished from non-toxic components 21 . The 
demonstration that amyloid-associated proteins share 
unique primary structural features and readily aggregate in 
an in vitro system provides further evidence for the three-step
molecular mechanism by which light chains form
fibrils, i.e., dimer → filament → fibril 19,20.

Wetzel and colleagues 22 have hypothesized a somewhat 
different theory to explain light chain amyloidogenesis. 
They posited that, while multiple substitutions at key sites 
in the V L are responsible for light chain amyloidogenicity, 
this phenomenon of amyloid fibril formation results because 
such replacements so modify the tertiary structure that the 
protein becomes partially or completely unfolded. It is this 
intermediate form, then, that is the culprit in AL amyloid 
formation (as has been postulated to occur in the 
transthyretin-associated amyloidoses 23 ). Experimental 
support for this theory was obtained using V L fragments 

FIGURE 2. Computer-based models of V L interactions leading to the formation of amyloid fibrils. (A) Backbone representation of a V L 
dimer. This basic structure has been found for most Ig light chains which have been studied crystallographically. The shading surrounding 
the "backbone" indicates the volume occupied by atoms. The portion of the molecule that constitutes the equivalent of an antibody's 
antigen-binding site is located at the "top" of the molecule in this representation; among light chains, most amino acid substitutions occur 
in this area. (B) Hypothetical arrangement of V L dimers during the formation of non-covalent polymers. The antigen-binding site of one 
dimer interacts with the surface of the axially-opposed end of a second dimer following a 90° rotation about the two-fold symmetry axis. 
Many amyloid-forming light chains have amino acids of opposite charge, as highlighted in this figure, located in positions to form high-affinity 
"salt bridges." A consequence of the linking of V L domains is that any weak "non-specific" affinity of the dimer surface for particular 
tissue surface features, such as proteoglycans, is amplified. Thus, a light chain, which individually would not interact with an organ or 
tissue, might be specifically bound to an organ when polymerized. (C) Interaction of V L-dimer filaments. When viewed from a different 
angle, the filament shown in (B) has a convoluted or "sawtooth" surface that makes it possible to bring together two or more filaments. In 
this figure an anti-parallel mechanism of bringing filaments together is illustrated: several amino acid positions at which substitutions could 
either enhance or suppress the docking of filaments are highlighted. (D) Space-filling representation of the proposed V L-dimer polymerization 
model. (Figure furnished by Dr. Fred J. Stevens, unpublished data) 

derived from DNA recombinant technology. These 
molecules, expressed in a bacterial system, were based on 
the V κ dimer REI - a presumably non-amyloidogenic 
protein, the tertiary structure of which had been elucidated 
by x-ray crystallography 24 . Six V L constructs were prepared 
that contained replacements thought to be unique in κ- or 
λ-type amyloid-associated light chains. These 
modifications, due to their locations, were expected to 
destabilize the folded structure of the V L domain. In contrast 
to the wild-type REI component, all six mutants, at low 
pH, formed aggregates and bound Congo red. When 
examined in the presence of guanidine hydrochloride 
(GdnHCl), four of the six were unstable and aggregated to 
produce amyloid-like fibrils. The most destabilizing effect 
occurred when a highly-conserved Arg at position 61 was 
replaced by Asp. This substitution was predicted to eliminate 
or decrease the potential to form a salt bridge/hydrogen bond 
with another Asp residue located at position 82 in an 
adjacent loop. Although the GdnHCl in vitro stability assay 
may provide a means to predict the fibrillogenic capacity 
of a Bence Jones protein, the relevance of this system to in 
vivo amyloid formation remains to be established. 

The aforementioned studies involving comparative 
analyses of light chain sequence have made it increasingly 
evident that such data per se are not sufficient to differentiate 
amyloid from non-amyloid components. In order to define 
more precisely the relationship between particular amino 
acid residues and light chain amyloidogenicity, it is 
necessary to determine how variations in sequence affect 
the overall tertiary structure of the molecule. 

Because amyloid proteins cannot be crystallized, 
computer modeling offers a powerful new tool to predict 
the three-dimensional impact imparted on a protein by 
modification of and interactions between amino acids. 
Additionally, the ability to generate via recombinant 
technology a replenishable source of human light chains in 
which primary structural modification can be induced 
represents an important means to advance our understanding 
of the protein basis of amyloidogenicity" 25 ". This 
technique, as well as computer technology, provides exciting 
new approaches to research in this field, since manipulation 
of key amino acids and, most importantly, visualization of 
resulting interactions should provide information on why 
certain proteins form amyloid. 

Light chain fragmentation in AL amyloidosis 

The structural requirements for the in vivo conversion 
of soluble light chain precursor proteins into insoluble 
amyloid fibrils have not been definitely established - e.g., 
it is unknown if the native protein itself can form amyloid 
fibrils in vivo or if this process is initiated or accelerated by 
light chain fragmentation. In rare instances, the protein 
extracted from AL amyloid deposits was found to be 
composed exclusively of an intact light chain 4 ". More 
commonly, these extracts consist of light chain fragments 
(V L or V L plus a portion of the C L ) 3 . It has not been 
conclusively established if these components are produced 
de novo 28,29 or result from catabolism or in situ proteolysis 
of intact light polypeptide chains. The fact that we have 
found in virtually all AL amyloid extracts examined a 
certain, albeit small, amount of the complete light chain 
suggests that the common occurrence of fragments 
represents a degradative process. It is noteworthy that, based 
on the tertiary structural features of light chains, as 
evidenced by x-ray crystallography, the C-terminal residues 
of amyloid fragments (in positions 152-154) are typically 
located in sterically exposed regions of the molecule; such 
sites would thus be accessible for endopeptidase digestion. 
We have been able to generate similarly sized fragments 
using various types of endoproteases (e.g., cathepsin D, 
pepsin, or that contained in a lysosomal extract prepared 
from kidney). Additionally, we have found that amyloid 
fragments isolated from different organs from the same 
patient can vary in molecular mass 30 . Although such 
variations may reflect deposition of synthetically-derived 
light chain fragments, it is also possible that light chains 
are deposited in a relatively intact form as amyloid and that 
degradation occurs as mediated by local tissue factors. If 
light chain proteolysis is, indeed, essential for amyloid 
formation, the presence or extent of light chain 
fragmentation could have prognostic or therapeutic 
implications. 

Experimentally, it has been shown that endoprotease 
digestion of certain Bence Jones proteins obtained from 
patients with or without amyloidosis yielded V L fragments 
that had the characteristic features of amyloid. 31 ' 34 This 
finding may indicate that the C L portion of the molecule 
interferes with amyloid formation or, as has been shown in 
vitro, is remarkably susceptible to proteolysis as compared 
to the V L domain 33 . The demonstration that regions within 
the light chain are potentially amyloidogenic has come from 
studies of Eulitz and co-workers 36 ' 38 , who found that tryptic 
digestion of certain extensively reduced and alkylated 
amyloid-associated Bence Jones proteins yielded 
precipitates that exhibited green birefringence after Congo 
red-staining and were fibrillar when viewed by polarization 
light microscopy and electron microscopy, respectively. The 
insoluble tryptic digests, when rendered soluble in 
trifluoroacetic acid and subjected to HPLC- 
chromatographic separation on a reversed-phase column, 
were found to contain characteristic 18- to 43-residue V L - 
and C L -related peptides (this discovery was analogous to 
the observation that synthetic 10- to 28-residue peptides 
corresponding to portions of the Alzheimer's disease 
amyloid-associated β protein, the islet amyloid polypeptide 
[IAPP], or prion protein (PrP) could form amyloid fibrils 
in vitro 39 - 43 ). More recent studies in our laboratory have led 
to the identification and synthesis of 12- to 24-mer V L - 
related peptides that are particularly fibrillogenic (Ref. 30 
and Niewold, Th. A. et al., unpublished studies). These 
findings suggest that, under appropriate conditions, all light 
chains are potentially amyloidogenic and that the 
susceptibility of these components to proteolytic cleavage 
may depend on other factors involved in light chain 
catabolism. That certain Bence Jones proteins can 
themselves function as peptidases 44 could also be of 
pathologic significance; however, the relevance of this 
phenomenon to the in vivo production of amyloid remains 
to be established. 

If, indeed, 12- to 24-mer "amyloid" peptides are 
generated in vivo through proteolysis of precursor 
monoclonal light chains, such components may contribute 
to the pathogenesis of AL amyloidosis. Lansbury and 
colleagues 45 "" have proposed that amyloid formation 
requires or is accelerated by a seeding mechanism whereby 
preformed amyloid fibrils serve as a nucleation substance 
that initiates conversion of a soluble precursor protein or 
peptide into amyloid. Three factors were deemed essential 
for a nucleation-dependent polymerization process to occur: 
First, a lag time is necessary before detectable aggregates 
appear; second, there must be a critical concentration of 
the monomer before polymerization occurs; and third, 
polymerization is accelerated by a preformed "seed" or 
nucleus. That this phenomenon may ensue in vivo has been 
evidenced by the rapid formation of amyloid in the 
experimental hamster and mouse AA models 45 ' 50 through 
the administration of "amyloid-enhancing factor" (AEF) - 
a sonicated preparation of an organ that contains either AA- 
or AL-related amyloid fibrils. 
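
The three criteria listed in this paragraph (a lag time, a dependence on monomer concentration, and acceleration by a preformed seed) can be illustrated numerically. The following is a minimal sketch and not a model taken from the cited work: it assumes simple mass-action kinetics with a slow primary nucleation step and a faster elongation step. The function name `half_time` and every rate constant are hypothetical values in arbitrary units, chosen only for illustration. Because this toy model omits fibril dissociation, it shows a concentration-dependent lag rather than a true critical concentration.

```python
def half_time(monomer0, seed=0.0, k_nuc=1e-4, k_elong=1.0, order=2,
              dt=0.001, t_max=200.0):
    """Return the time at which half of the initial monomer has become fibril.

    All parameters are illustrative (arbitrary units), not measured values.
    """
    monomer, fibril, t = monomer0, seed, 0.0
    target = seed + 0.5 * monomer0
    while t < t_max:
        # Slow primary nucleation plus fibril-catalysed elongation.
        rate = k_nuc * monomer ** order + k_elong * monomer * fibril
        converted = min(rate * dt, monomer)  # explicit Euler step
        monomer -= converted
        fibril += converted
        t += dt
        if fibril >= target:
            return t
    return None  # half-conversion not reached within t_max


unseeded = half_time(1.0)             # no preformed fibrils
seeded = half_time(1.0, seed=0.01)    # 1% preformed "seed"
dilute = half_time(0.5)               # lower monomer concentration
print(f"unseeded t1/2 = {unseeded:.2f}")
print(f"seeded   t1/2 = {seeded:.2f}")
print(f"dilute   t1/2 = {dilute:.2f}")
```

In this sketch the seeded run reaches half-conversion sooner than the unseeded run, and the dilute run takes longest, which mirrors the qualitative behaviour described in the text.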

However, in vitro experiments have indicated that 
seeding is protein-specific; i.e., there must be 
complementarity between seed and some portion of the 
precursor protein for growth to occur, as shown in the case 
of βA4 and prion fibrillogenesis 43 51 ". The demonstration 
that amyloid formation can be accelerated by an apparent 
nucleation-dependent polymerization mechanism provides, 
as suggested by Jarrett and Lansbury 46 , a rational therapeutic 
approach - namely, reduction of precursor protein 
concentration and interference with the generation of the 
nucleus should slow this relentless process. In this regard, 
α1-antichymotrypsin and a silicate compound (Na4SiO4) 
were found to inhibit βA4 fibrillogenesis 54 - 55 . Another agent 
- 4'-iodo-4'-deoxydoxorubicin (IDOX), an iodinated 
anthracycline compound - was shown to bind with high 
affinity to AL as well as other types of amyloid deposits 
and also interfered with in vitro fibrillogenesis. When 
incubated with AEF, this agent significantly reduced AA 
amyloid formation in an in vivo experimental animal 
system 56 . Further, five of eight patients with AL amyloidosis 
who were treated with this compound had objective 
evidence of amyloid resorption, despite no diminution in 
Bence Jones proteinuria 57 . 

Pathologic heterogeneity of light chain deposits 

Although the three major forms of the human light 
chain-associated diseases (myeloma [cast] nephropathy, 
light chain deposition disease and AL amyloidosis) exist as 
discrete entities 14 - 58 - 5 ', the coexistence of two distinct forms 
of light chain-associated disease - e.g., light-chain 
deposition disease and amyloidosis - has been noted 5 '- 62 . It 
is not known whether each form results from the presence 
in individual patients of multiple Bence Jones protein 
components, from the potential of a single Bence Jones 
protein to induce more than one type of pathology, or from 
a natural progression of one disease state to another, as 
evidenced by the common involvement of blood vessel 
walls in non-fibrillar and fibrillar forms of light chain 
deposition. 

The fact that a monoclonal light chain can assume 
multiple conformers according to the solvent used for 
crystallization"- 65 provides evidence that physiological 
factors can profoundly influence V L dimer interaction. The 
failure of a particular Bence Jones protein to form amyloid 
might be attributed to the greater propensity of the protein 
to aggregate as amorphous casts (myeloma [cast] 
nephropathy) or as punctate, linear deposits (light chain 
deposition disease). Alternatively (as previously discussed), 
particular amino acid substitutions may prevent V L 
aggregation or, if light chain fragmentation is a prerequisite 
for amyloid formation, such substitutions may so modify 
the native conformation of the light chain that protease- 
sensitive sites in the C L domain are rendered inaccessible. 
Furthermore, local factors may change the protein thus 
deposited: Initially the deposits may be non-fibrillar but 
are subsequently modified and assume the β-pleated 
structure typical of amyloid fibrils. 

Host factors in the pathogenesis of AL amyloidosis 

Accessory Molecules 

Interactions between monoclonal light chains and other 
biologically active molecules may also lead to pathologic 
deposition. For example, myeloma (cast) nephropathy has 
been thought to be the product of precipitated complexes 
comprised of Bence Jones protein and a component 
produced by the renal distal tubules - Tamm-Horsfall 
protein 64 " 6 *. In the case of AL amyloid, the deposits, in 
addition to light chains, have been shown to contain several 
chemically diverse substances, including amyloid P 
component", the highly-sulfated glycosaminoglycans 
(GAGs) heparan sulfate and dermatan sulfate 48 -' 0 and 
apolipoprotein-E (Apo-E)' 1 . Other types of molecules, such 
as ubiquitin, complement-related components, collagen, 
fibronectin and laminin, have been detected as well. The 
demonstration that certain of these compounds (e.g., Apo- 
E) can accelerate fibril formation in vitro has led to their 
designation as "pathologic chaperones 71-73 ." It has been 
proposed that these molecules induce a β-pleated-sheet 
conformation of the precursor protein that, in turn, leads 
eventually to fibril formation. However, we and others"- 34 - 36 - 
38 - 74 have demonstrated that isolated light chains and related 
fragments can form amyloid fibrils in vitro in the absence 
of ancillary compounds. Thus, it is possible that the effects 
of these substances are secondary; i.e., binding is non- 
specific and results from the highly adsorptive nature of 
the β-pleated-sheet structure of the amyloid fibril. The 
"pathologic chaperones," which are not found in non- 
amyloid forms of light chain deposition 62 - 72 , may not 
necessarily facilitate fibrillogenesis but, rather, participate 
in other functions that are part of the amyloidogenic process 
- e.g., GAGs may play an important role in selective organ 
localization (see below) of AL amyloid deposits and P 
component has been implicated as a factor that can inhibit 
or prevent the resolution of this material 75 . 

Light chain metabolism 

Essential to the pathogenesis of AL amyloidosis is a 
continuous endogenous source of precursor protein, i.e., 
the monoclonal light chain. However, the amount of protein 
required for this process is unknown. Since the catabolic 
half-life of light chains is normally < 2 h, the concentration 
of these components in serum or urine reflects only a small 
percentage of the protein synthesized by the monoclonal 
plasma-cell population 76 . In contrast to patients with 
multiple myeloma, individuals with AL amyloidosis most 
often have a low percentage of bone marrow plasma cells 
(median 5%) and minimal Bence Jones proteinuria (median 
0.4 g/24 h) 77 . Although there is evidence to suggest that a 
population of monoclonal B cells circulates and serves as 
precursor to the bone marrow-derived plasma cells 78 , it is 
unlikely that there exists a significant extramedullary 
cellular source of monoclonal light chains (except in the 
case of localized AL amyloid deposits 79 ). There is as yet no 
evidence to indicate that the light chain synthetic rate of 
monoclonal plasma cells obtained from patients with AL 
amyloidosis is greater than that of plasma cell populations 
associated with other forms of pathologic light chain 
deposition. The fact that patients with AL amyloidosis can 
have extensive disease despite a seemingly low serum or 
urine concentration of precursor proteins suggests (in the 
absence of an inordinately high rate of synthesis) an 
exceptional propensity of these molecules to form amyloid. 
It is also possible that such patients may have unusually 
high concentrations of light chain binding factors 80 ' 82 or a 
molecular defect that limits or prevents the elimination of 
the AL deposits. 

Organ diversity of AL deposits 

Central to the pathophysiology of this disease is the 
remarkable diversity that exists in the organs affected in 
patients with AL amyloidosis 77 - 83 . In some individuals, the 
deposits are confined principally to the kidney and in others 
to the heart, small intestine, or peripheral and/or sympathetic 
nerves, etc. Further, they may be primarily vascular, 
interstitial, or both. Our analyses of protein extracted from 
spleen and other organs have shown no relationship between 
the molecular mass of the deposited material and the 
affected tissue 30 . Whether selective tissue affinity is related 
to specific primary structural features of the light chain that 
result in an interaction with local tissue factors or to an 
antibody-like affinity of certain light chains for particular 
tissue constituents remains to be established. Previous 
attempts to demonstrate organ specificity in vitro using 
fluorescein-labeled native Bence Jones proteins obtained 
from patients with particular organ involvement, although 
showing qualitative differences, were considered 
inconclusive'. 

The demonstration by x-ray crystallography that Bence 
Jones proteins and V L dimers can structurally mimic Fab 
fragments, including the antigen-binding site 19 , implies that 
the selective organ deposition of light chains may represent 
an antibody-ligand interaction. Alternatively, AL amyloid 
deposition may be due to the synthesis of the amyloidogenic 
precursor protein by a local monoclonal plasma-cell 
population. This phenomenon would be analogous to the 
site-specific production of other types of amyloid - e.g., 
ACal, AANF, AIAPP, Aβ - associated with precursor 
proteins (calcitonin, atrial natriuretic factor, islet amyloid 
polypeptide, βA4 protein, respectively). Another 
mechanism that may account for the localization of AL 
amyloid deposits within particular organs involves the 
interaction between precursor light chains and tissue- 
specific molecules - e.g., GAGs. It remains to be determined 
if protein structure, organ affinity, or local production of 
light chains, etc. is responsible for the clinical diversity 
found in localized amyloid deposition. 

Pathophysiology of AL amyloidosis 

Other physiologic factors analogous to those implicated 
in myeloma (cast) nephropathy may be involved in the 
pathogenesis of AL amyloidosis. For example, the 
precipitation of Bence Jones proteins in the form of renal 
tubular casts is accelerated by dehydration or other 
conditions that can adversely affect renal function - e.g., 
hypercalcemia, anemia, or infection 1 ' 1 . Due to the profound 
effect of solvent composition on light chain tertiary 
structure 43 , it is possible that, under appropriate conditions, 
soluble precursor light chains can be induced to form 
amyloid fibrils. 

Other host (patient)-related elements that could account 
for AL amyloid pathogenesis include the local processing 
by macrophages of soluble Bence Jones protein to form 
amyloid fibrils 84 or the failure of these cells to remove or 
prevent fibrillar light chain deposits. In contrast to the typical 
giant-cell reaction to Bence Jones protein-containing casts 
in myeloma (cast) nephropathy, characteristically, there is 
scant inflammatory response to AL amyloid deposits, 
presumably since this material is weakly or non- 
immunogenic 85,86 . With rare exception, AL amyloid deposits 
are apparently irreversible. The failure of the host to degrade 
AL amyloid fibrils may result from a deficiency in 
macrophage function or lack of a requisite proteolytic 
enzyme (as postulated in the pathogenesis of brain amyloid 
in Alzheimer's disease 87 ' 88 ). The coating of amyloid fibrils 
by "pathologic chaperones" - e.g., P component" - may 
also interfere with fibril degradation. Although there is no 
documented difference in the association of such 
components with λ- or κ-type amyloid fibrils, it has been 
reported that α1-antitrypsin was capable of disaggregating 
λ- but not κ-chain-containing fibrils 8 '. 

In vivo models of AL amyloidosis 

Further insight into the pathogenesis and, ultimately, 
the effective treatment of AL amyloidosis depends, in part, 
on the availability of in vivo experimental models that 
duplicate the human form of this disease. We have reported 
the results obtained with one such model whereby mice 
were repeatedly injected with different Bence Jones proteins 
obtained from patients with AL amyloidosis. In these 
experiments, the human proteins were deposited in the 
kidney and other organs of the recipient animals in the form 
of amyloid' 0 ". The human light chain amyloid also 
contained mouse amyloid P component. Conversely, no 
deposits were found in mice injected under similar 
conditions with a "control," i.e., non-amyloid, Bence Jones 
protein. Due to the relatively large amount of protein needed 
for injection and the length of time required for the amyloid 
to form, other models in which human amyloid-associated 
light chains are produced transgenically or by transfectomas 
may prove more suitable. The development of light chain- 
related amyloid in such in vivo models will provide an 
invaluable means for further research on the pathogenesis, 
treatment and prevention of AL amyloidosis. 

abstract: Light chain, or AL, amyloidosis is a pathological condition arising from systemic extracellular 
deposition of monoclonal immunoglobulin light chain variable domains in the form of insoluble amyloid 
fibrils, especially in the kidneys. Substantial evidence suggests that amyloid fibril formation from native 
proteins occurs via a conformational change leading to a partially folded intermediate conformation, whose 
subsequent association is a key step in fibrillation. In the present investigation, we have examined the 
properties of a recombinant amyloidogenic light chain variable domain, SMA, to determine whether partially 
folded intermediates can be detected and correlated with aggregation. The results from spectroscopic and 
hydrodynamic measurements, including far- and near-UV circular dichroism, FTIR, NMR, and intrinsic 
tryptophan fluorescence and small-angle X-ray scattering, reveal the build-up of two partially folded 
intermediate conformational states as the pH is decreased (low pH destabilized the protein and accelerated 
the kinetics of aggregation). A relatively nativelike intermediate, I_N, was observed between pH 4 and 6, 
with little loss of secondary structure, but with significant tertiary structure changes and enhanced ANS 
binding, indicating exposed hydrophobic surfaces. At pH below 3, we observed a relatively unfolded, but 
compact, intermediate, I_U, which was characterized by decreased tertiary and secondary structure. The I_U 
intermediate readily forms amyloid fibrils, whereas I_N preferentially leads to amorphous aggregates. Except 
at pH 2, where negligible amorphous aggregate is formed, the amorphous aggregates formed significantly 
more rapidly than the fibrils. This is the first indication that different partially folded intermediates may 
be responsible for different aggregation pathways (amorphous and fibrillar). The data support the hypothesis 
that amyloid fibril formation involves the ordered self-assembly of partially folded species that are critical 
soluble precursors of fibrils. 



Immunoglobulin (Ig) light chains are involved in several 
protein deposition diseases, including one resulting in the 
formation and deposition of amyloid fibrils (light chain or 
AL amyloidosis) and another known as light chain deposition 
disease (LCDD) involving amorphous protein deposits (1, 2). The 
morphology of the deposited aggregates in these two diseases 
is clearly different, and typically patients exhibit only one 
form of light chain deposition. However, there is at least 
one report of a patient exhibiting both AL amyloidosis and 
LCDD involving the same light chain (3). The exact length 
of light chains in amyloid deposits varies, but is usually in 
the 110-130 residue range (12-14 kDa) corresponding to 
the intact variable domain (4). 

The molecular mechanisms leading to amyloid formation 
are poorly understood. In this report, we address the question 
of why some immunoglobulin light chains form amyloid and 
related deposits while others do not, in particular the 
hypothesis that protein aggregation arises from the self- 
association of partially folded intermediates. Support for this 
hypothesis has been found with proteins such as transthyretin 
(5) and lysozyme (6). We postulate that amyloid fibril 
formation from native proteins occurs via a conformational 
change leading to formation of a partially folded intermediate 
conformation, association of this intermediate to form soluble 
oligomers leading to the critical nucleus, and subsequent 
[text illegible in source] a filament 
or protofibril, and finally association of protofibrils into 
mature fibrils (7). 

We have investigated the biophysical properties and 
amyloidogenicity of the variable domain of a recombinant 
amyloidogenic light chain, SMA, engineered by Stevens et 
al. (8). SMA (114 residues, M r = 12 700) was initially 
extracted from lymph node-derived amyloid fibrils of an AL 
amyloidosis patient (9). A very similar light chain domain, 
LEN, was derived from a patient with multiple myeloma 
who showed no evidence of renal dysfunction or amyloidosis 
(10). The IgG-V L domains consist of a highly conserved 
framework formed by two sheets of antiparallel β-strands 
forming a β-sandwich, and three loops comprising the 
complementarity-determining region (CDR) that form part 
of the antigen binding site. The sequences of SMA and LEN 
are very similar, differing only at 8 positions out of 114. 
Four of these are in CDR3 (Q89H, T94H, Y96Q, S97T), 
two are in CDR1 (S29N, K30R), and the remaining two are 
in the framework region (P40L, I106L). The high-resolution 
crystallographic structure of LEN (1.8 Å) has been solved 
(PDB Accession No. 1LVE) (11). Both SMA and LEN 
belong to the κIV family of Igs. 
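
The substitution list in this paragraph can be tabulated programmatically. The snippet below is a small illustrative sketch: it assumes the conventional notation in which each entry reads as original residue, position, replacement residue (for example, Q89H), and it assumes the first letter refers to LEN and the last to SMA, which the text implies but does not state outright. The names `SUBSTITUTIONS`, `parse_substitution`, and `summarize` are hypothetical.

```python
import re

# Substitutions exactly as listed in the text, grouped by region.
# Assumed notation: <residue in LEN><position><residue in SMA>.
SUBSTITUTIONS = {
    "CDR3": ["Q89H", "T94H", "Y96Q", "S97T"],
    "CDR1": ["S29N", "K30R"],
    "framework": ["P40L", "I106L"],
}


def parse_substitution(code):
    """Split a code such as 'Q89H' into ('Q', 89, 'H')."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"unrecognized substitution: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def summarize(substitutions, length=114):
    """Return per-region counts, total differences, and percent identity."""
    counts = {region: len(codes) for region, codes in substitutions.items()}
    total = sum(counts.values())
    identity = 100.0 * (length - total) / length
    return counts, total, identity


counts, total, identity = summarize(SUBSTITUTIONS)
print(counts, total, f"{identity:.1f}% identical")
```

Six of the eight differences fall in the CDRs, which is consistent with the text's description of where the two sequences diverge.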

The amyloidogenic light chain, SMA, is significantly less 
thermodynamically stable than LEN under all conditions 
(unpublished observations). The presence of low concentrations 
of denaturants also results in fibril formation from the 
[text illegible in source] present study [text illegible in source] biophysical 
characterization of the conformation of SMA as a function 
of pH to reveal the presence of two distinct partially folded 
intermediates: one with relatively nativelike properties, the 
other with relatively unfolded properties. Destabilizing 
conditions at physiological pH, e.g., low urea concentration, 
also lead to aggregation and fibril formation. Thus, conditions 
that result in population of these intermediates lead to 
aggregation, supporting the hypothesis that partially folded 
intermediates are key precursors on the aggregation pathway. 
Interestingly, both amorphous and fibrillar aggregates were 
observed, and were shown to arise from two different 
intermediates. 

MATERIALS AND METHODS 

Expression and Purification of Recombinant SMA. The 
recombinant V L domain SMA was purified from JM83 E. 
coli cells transformed with the plasmid pklVsmaCHM, generously 
provided by Dr. Fred Stevens, Argonne National Lab 
(8). The plasmid construct was based on the pASK vectors 
constructed by Skerra et al. (13) and contained an ompA 
leader for periplasmic localization of the protein to ensure 
the formation of the core disulfide bond. The overexpressed 
protein was purified using the procedure of Stevens et al. 
(8) with minor modifications. Briefly, the recombinant 
protein was extracted from the periplasm using osmotic shock 
via treatment with ice-cold TES followed by distilled water. 
The periplasmic extract was dialyzed against 4 changes of 
20 volumes of 10 mM acetate buffer, pH 5.6, and loaded 
onto a fast-flow SP Sepharose column. The column was 
washed with 10 mM acetate buffer, pH 5.6, and the protein 
was eluted with phosphate buffer, pH 8 (concentration illegible in source). Fractions 
were assayed by SDS-PAGE, and fractions containing the 
recombinant protein were pooled, filtered through 0.22 μm 
filters, and stored in glass vials. Typical yields were 7-8 
mg of purified protein per liter of cells. Protein concentrations 
were measured via optical density at 280 nm using the 
extinction coefficient of E 0.1% = 1.8 calculated from the 
sequence. The purified protein was stored in 10 mM 
phosphate buffer (pH 8.0) at 4 °C and used within 2 weeks 
of the initial purification. The purity of the protein preparations 
was assayed by SDS-PAGE and by electrospray mass 
spectrometry (Micromass Quattro II). 
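
The concentration measurement described in this section follows the Beer-Lambert law. The sketch below assumes that the extinction coefficient quoted in the text denotes the absorbance at 280 nm of a 1 mg/mL solution in a 1 cm cell (value 1.8), and it uses the stated M r of 12 700 to convert to molar units. The function names and the example absorbance are hypothetical.

```python
def protein_mg_per_ml(a280, e_01_percent=1.8, path_cm=1.0):
    """Concentration in mg/mL from A280, assuming E(0.1%, 1 cm) = 1.8."""
    return a280 / (e_01_percent * path_cm)


def micromolar(mg_per_ml, molecular_mass=12700.0):
    """Convert mg/mL (equivalently g/L) to micromolar using M r = 12 700."""
    return mg_per_ml / molecular_mass * 1e6


a280 = 0.9  # hypothetical absorbance reading
conc = protein_mg_per_ml(a280)
print(f"{conc:.2f} mg/mL = {micromolar(conc):.1f} uM")
```

Under these assumptions, the 0.5 and 1.7 mg/mL samples used later for CD correspond to roughly 39 and 134 μM of the variable domain.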

Intrinsic Tryptophan Fluorescence Measurements. Fluo- 
rescence measurements were made with a FluoroMax-2 
fluorescence spectrometer (Jobin Yvon-Spex). Emission 
spectra between 300 and 420 nm were collected with 
excitation at 280 nm. Spectra were collected at different pHs 
within the range of 10-2 using 0.5 μM protein samples in 
50 mM of the appropriate buffer containing 100 mM NaCl. 
Spectra were collected at 25 and 37 °C. The stability of SMA 
toward urea denaturation was monitored as a function of pH 
by recording changes in tryptophan fluorescence intensity 
upon excitation at 280 nm and emission at 350 nm at 25 °C. 

Samples of SMA (1 μM monomer) were incubated in 20 
mM phosphate buffer (pH 7.4), 100 mM NaCl containing 
varying amounts of urea (0-8 M) for 2 h to ensure 
completion of the unfolding reaction. Data were analyzed 
by nonlinear least-squares fitting to a two-state folding model. 
The fraction unfolded, $F_u$, was calculated using the equation: 
$F_u = (y_f - y)/(y_f - y_u)$, where $y$ represents the observed 
fluorescence at a particular concentration of urea, and $y_f$ and 
$y_u$ represent the corresponding fluorescence of the folded and 
unfolded states, respectively, at that urea concentration. For 
baseline fitting, a linear least-squares analysis was performed 
to determine the values of $y_f$ and $y_u$ as a function of urea 
concentration. The free energies of unfolding were calculated 
as a function of urea concentration using the equation: 
$\Delta G = -RT \ln K_u$, where $K_u = F_u/(1 - F_u)$. $\Delta G^0$ was determined 
by linear extrapolation to zero urea concentration using the 
expression: $\Delta G^0 = \Delta G + m[\mathrm{urea}]$. 
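
The two-state analysis described in this paragraph can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' analysis code: the fraction unfolded is taken as (y_f - y)/(y_f - y_u), the free energy as -RT ln[F_u/(1 - F_u)], and the extrapolation as a straight-line fit of free energy against urea concentration. For brevity the folded and unfolded baselines are treated as constants, whereas the text fits them as linear functions of urea. All function names and the synthetic parameters (20 kJ/mol and 5 kJ/mol/M) are hypothetical.

```python
import math

R_KJ = 8.314e-3  # gas constant, kJ mol^-1 K^-1


def fraction_unfolded(y, y_f, y_u):
    """F_u = (y_f - y) / (y_f - y_u)."""
    return (y_f - y) / (y_f - y_u)


def free_energy(f_u, temp_k=298.15):
    """Delta G = -RT ln K_u with K_u = F_u / (1 - F_u)."""
    return -R_KJ * temp_k * math.log(f_u / (1.0 - f_u))


def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extrapolate(ureas, signals, y_f, y_u, temp_k=298.15):
    """Return (dG0, m) from the line Delta G = dG0 - m [urea].

    Only transition-region points (0.05 < F_u < 0.95) are used.
    """
    xs, ys = [], []
    for conc, y in zip(ureas, signals):
        f_u = fraction_unfolded(y, y_f, y_u)
        if 0.05 < f_u < 0.95:
            xs.append(conc)
            ys.append(free_energy(f_u, temp_k))
    slope, intercept = linear_fit(xs, ys)
    return intercept, -slope


def synthetic_signal(conc, dg0, m, y_f, y_u, temp_k=298.15):
    """Noise-free two-state signal, used only to exercise the functions."""
    k_u = math.exp(-(dg0 - m * conc) / (R_KJ * temp_k))
    f_u = k_u / (1.0 + k_u)
    return y_f - f_u * (y_f - y_u)


ureas = [0.25 * i for i in range(33)]  # 0 to 8 M urea
signals = [synthetic_signal(c, 20.0, 5.0, 1.0, 0.2) for c in ureas]
dg0, m_value = extrapolate(ureas, signals, y_f=1.0, y_u=0.2)
print(f"dG0 = {dg0:.2f} kJ/mol, m = {m_value:.2f} kJ/mol/M")
```

Restricting the fit to the transition region avoids taking the logarithm of values near 0 or infinity, where small baseline errors dominate the calculated free energy.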

ANS Binding. 1,8-Anilinonaphthalenesulfonate was obtained
from Kodak, and a stock solution was prepared by
dissolving in water followed by filtration through a 0.2 µm
syringe filter and measuring the concentration using an
extinction coefficient of 5000 M⁻¹ cm⁻¹ at 350 nm. The
fluorescence emission spectra of solutions containing ANS
and 0.5 µM protein were collected between 420 and 600
nm upon excitation at 380 nm as a function of pH at 37 °C.
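As a worked example of the concentration determination, the Beer-Lambert relation A = εcl can be applied with the extinction coefficient quoted above. The sketch below is illustrative only: the absorbance reading of 0.50 is a hypothetical input, the function names are not from the source, and the 10 µM working concentration is the value given in the Figure 5 legend.

```python
def ans_concentration_molar(absorbance, epsilon=5000.0, path_cm=1.0):
    """Beer-Lambert: c = A / (epsilon * l), with epsilon in M^-1 cm^-1."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance / (epsilon * path_cm)


def stock_volume_ul(stock_molar, target_molar, final_volume_ul):
    """Volume of stock needed to reach target_molar in final_volume_ul."""
    return target_molar * final_volume_ul / stock_molar


# Hypothetical absorbance reading at 350 nm in a 1 cm cell.
stock = ans_concentration_molar(0.50)
# Stock volume for 1 mL of 10 uM ANS.
volume = stock_volume_ul(stock, 10e-6, 1000.0)
print(stock, volume)
```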

Circular Dichroism Spectra. CD spectra were collected 
on an AVIV model 62DS spectrometer between 260 and 190 
nm for the far-UV region and between 320 and 250 nm for 
the near-UV region, with a step size of 0.5 nm and an 
averaging time of 5 s and collecting 5 repeat scans. Cells of 
1 and 0.01 cm path length were used for near- and far-UV 
CD measurements with protein concentrations of 0.5 and 1.7 
mg/mL, respectively.

pH Dependence. pH-dependent changes in spectroscopic
data were fit using a modified Henderson-Hasselbalch
equation for one (eq 1) or two (eq 2) transitions, to determine
the midpoints of the transitions:

$$Y_{obs} = \frac{Y_{I_N} + Y_N \, 10^{(pH - pH_{m1})}}{1 + 10^{(pH - pH_{m1})}} \qquad (1)$$

$$Y_{obs} = \frac{Y_{I_N} + Y_N \, 10^{(pH - pH_{m1})}}{1 + 10^{(pH - pH_{m1})}} + \frac{Y_{I_U} - Y_{I_N}}{1 + 10^{(pH - pH_{m2})}} \qquad (2)$$

where $Y_{obs}$ is the observed spectroscopic property, $Y_N$ is the
value of the spectroscopic property for the native state, $Y_{I_N}$
is the spectroscopic property for the nativelike intermediate,
and $Y_{I_U}$ is the spectroscopic property for the unfolded-like
intermediate. $pH_{m1}$ and $pH_{m2}$ are the midpoints of transitions
from the native state to the I_N intermediate and from I_N to
the I_U intermediate, respectively (36).
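A sigmoidal pH-transition model of this kind can be sketched as follows. The functional form is an assumption (a standard Henderson-Hasselbalch-type titration curve written in terms of low-pH and high-pH plateau values), and the grid-search fit is a deliberately simple stand-in for nonlinear least squares. All function names are hypothetical. The example values (plateaus of 345 and 347 nm, midpoint 5.6) are borrowed from the tryptophan emission maximum data reported in the Results.

```python
def one_transition(ph, y_low, y_high, ph_m):
    """Single titration: y_low at low pH, y_high at high pH, midpoint ph_m."""
    weight = 10.0 ** (ph - ph_m)
    return (y_low + y_high * weight) / (1.0 + weight)


def two_transitions(ph, y_low, y_mid, y_high, ph_m_upper, ph_m_lower):
    """Two well-separated titrations: y_high -> y_mid -> y_low as pH falls."""
    upper = one_transition(ph, y_mid, y_high, ph_m_upper)
    lower = (y_low - y_mid) / (1.0 + 10.0 ** (ph - ph_m_lower))
    return upper + lower


def fit_midpoint(phs, ys, y_low, y_high, lo=2.0, hi=10.0, step=0.01):
    """Grid search for the midpoint that minimizes the squared error."""
    best_ph_m, best_sse = lo, float("inf")
    n_steps = int(round((hi - lo) / step))
    for i in range(n_steps + 1):
        ph_m = lo + i * step
        sse = sum(
            (one_transition(p, y_low, y_high, ph_m) - y) ** 2
            for p, y in zip(phs, ys)
        )
        if sse < best_sse:
            best_ph_m, best_sse = ph_m, sse
    return best_ph_m


# Synthetic emission-maximum data (nm) generated from the model itself.
phs = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [one_transition(p, 345.0, 347.0, 5.6) for p in phs]
midpoint = fit_midpoint(phs, ys, 345.0, 347.0)
print(round(midpoint, 2))
```

In practice the plateau values would be fitted together with the midpoint. They are held fixed here to keep the sketch short.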

Thin film ATR-FTIR measurements were performed using
a SPECAC out-of-compartment ATR accessory and a Nicolet
800SX FTIR bench. A germanium crystal IRE was used for
hydrated thin films of ~50-100 µg of protein prepared from
both soluble and insoluble protein as previously described
(14, 15). ATR-FTIR spectra were collected followed by
Fourier transformation of the sample spectra using a clean
crystal spectrum as a background. The water vapor spectrum
was collected by reducing the air purge and subtracted from
the protein spectrum until the spectra were featureless in the
region between 1700 and 1800 cm⁻¹. ATR-FTIR spectra for
SMA were collected at pH 7.5 in 50 mM sodium phosphate,
100 mM NaCl, at pH 5.0 in 50 mM sodium acetate, 100
mM NaCl, and at pH 2.0 in 20 mM HCl, 100 mM NaCl.
Buffer spectra were subtracted from the sample spectra.
Component spectra were obtained by identifying peak
positions using both second-derivative and Fourier-self-
deconvoluted spectra, followed by curve-fitting to the raw
spectrum (16).

In Vitro Fibril Formation Assays. Fibril formation was 
monitored using a fluorescence assay based on the enhanced 
fluorescence of the dye Thioflavin T (TFT) on binding to 
amyloid fibrils (17). Amyloid fibrils were grown from 
purified protein (40 µM) in 50 mM buffer and 100 mM NaCl.
A filtered protein sample (using 0.22 µm syringe filters) was
incubated under the desired conditions in a 1.8 mL flat-
bottomed screw-capped glass vial with moderate stirring
using a Teflon-coated micro-stir bar. Typical fibril growth
experiments involved incubating the protein at 37 °C with
constant stirring and removing aliquots (10 µL) over time
(see below). Standard buffers included 20 mM HCl or
phosphate (pH 2), 50 mM formate (pH 3 and 4), 50 mM
cacodylate or acetate (pH 5 and 6), and 20 mM TRIS or 20
mM HEPES or 50 mM phosphate (pH 7). Both Rayleigh
light scattering and fluorescence spectra were collected using 
a SPEX/Jobin-Yvon Fluoromax-2 spectrofluorometer. Con- 
stant temperatures were maintained using a circulating water 
bath. At each time point, a 220 µL sample was removed
and transferred to a cylindrical quartz microcell to measure
Rayleigh light scattering at 330 nm with a 1 nm band-pass
for both excitation and emission monochromators. Thioflavin
T binding assays were conducted by adding sample aliquots
(10 µL) to 990 µL of 20 µM TFT in 50 mM TRIS, pH 7.5,
and 100 mM NaCl in a 1 mL fluorescence cuvette.
Fluorescence emission was monitored with excitation at 450 
nm using a 5 nm band-pass on both the excitation and 
emission monochromators. Fluorescence intensities were 
reported at 482 nm. 

SAXS Measurements. Small-angle X-ray scattering mea- 
surements were performed on beam line 4-2 at the Stanford 
Synchrotron Radiation Laboratory (SSRL) as described 
previously (18). The SAXS instrument was configured with
a Mo/B4C multilayer monochromator, an 18 mm beamstop,
and a 218 cm sample-to-detector distance. Data were
collected on protein samples ranging from 0.5 to 10 mg/mL
in 50 mM buffer and 100 mM NaCl at 37 °C using a PTFE
flow-cell with 1.3 mm path length to minimize radiation 
damage. Radii of gyration were calculated using the Guinier 
approximation (19). 
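The Guinier approximation relates the low-angle scattering intensity to the radius of gyration through ln I(q) = ln I(0) - (q² R_g²)/3, so R_g follows from the slope of ln I against q². The sketch below is a minimal illustration with hypothetical names and synthetic intensities. It assumes the points supplied already lie within the Guinier region (roughly q·R_g below about 1.3) and uses the pH 7 value of 19.6 Å from the Results only as a test input.

```python
import math


def guinier_fit(qs, intensities):
    """Return (R_g, I0) from a linear fit of ln I versus q^2."""
    xs = [q * q for q in qs]
    ys = [math.log(i) for i in intensities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("Guinier slope must be negative")
    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Synthetic data: q in inverse angstroms, R_g in angstroms (assumed inputs).
true_rg = 19.6
qs = [0.010 + 0.005 * i for i in range(11)]
intensities = [100.0 * math.exp(-((q * true_rg) ** 2) / 3.0) for q in qs]
rg, i0 = guinier_fit(qs, intensities)
print(round(rg, 2), round(i0, 2))
```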

Atomic Force and Transmission Electron Microscopy.
Transmission electron micrographs were collected using a
JEOL JEM-100B microscope operating with an accelerating
voltage of 80 kV. Typical nominal magnifications ranged
from 27000x to 67000x. Samples were deposited on
Formvar-coated 300 mesh copper grids and negatively 
stained with freshly prepared 2% aqueous uranyl acetate. 

For AFM, aliquots of 50 µL of incubation solution were
transferred to an Eppendorf tube and spun to pellet the 
precipitated material, which was then washed twice with 
water before resuspending in deionized water. A drop of 
aggregate suspension was deposited on freshly cleaved mica 
and dried immediately with nitrogen gas. The samples were 
imaged with an Autoprobe CP AFM (Park Scientific, 
Sunnyvale, CA) in the noncontact (NC-AFM) mode. The 
tube scanner was a 100 µm Scanmaster (Park Scientific).
NC Ultralevers (Park Scientific) were used as cantilevers.
The resonant frequency was ~100 kHz. The images were
taken in air, under ambient conditions, at a scan frequency of 1-2
Hz, using silicon nitride tips.

NMR Spectroscopy. ¹H NMR spectra were collected using
a Varian Unity+ 500 spectrometer equipped with ultrashims
and a Varian triple resonance probe. Presaturation and
postacquisition digital filtering were used for solvent sup-
pression. Data were collected on 0.5 mM protein samples
containing 100 mM NaCl and 10% D2O. The pH was
adjusted by titration with NaOH or HCl as needed. All data
were recorded at 37 °C.

Aggregation as a Function of Protein Concentration.
The rate of aggregation was monitored by static light
scattering using a Nepheloskan instrument from Labsystems
and a 96-well plate reader. Solutions of 3.5 and 7 mg/mL
SMA at the appropriate pH were incubated at 37 °C along
with their corresponding buffers, and scattering was measured
every 15 min for 6 h.

pH Jump Experiments. Interconversions between N, I_N,
I_U, and U were monitored by diluting 10 µM SMA at one
pH into buffer of another pH, such that the final concentration
of protein was 1 µM. After manually mixing, the intensity
of either tryptophan fluorescence (excitation 280 nm and 
emission 345 nm) or ANS fluorescence (excitation 380 nm 
and emission 470 nm) was monitored, using a time-based 
scan on a Spex Fluoromax instrument with 1 s time- 
averaging. 

RESULTS 

The enhanced fluorescence emission
of the dye Thioflavin T on association with amyloid
fibrils provides a very convenient method to monitor the
kinetics of amyloid fibril formation (17, 20). Fibril formation
by SMA was investigated by stirring solutions of SMA at 
various values of pH at 37 °C. At room temperature, or in 
the absence of stirring, no enhancement of TFT was observed 
for several days. 

The rate of fibril formation from SMA was found to be
very sensitive to a number of extrinsic factors, including pH,
agitation, and temperature. As is typical for other amyloid
fibril growth curves, SMA fibrillation kinetics exhibit a quasi-
sigmoidal curve consisting of a lag phase followed by a
logarithmic growth phase that eventually plateaus (Figure
1). The slight drop in TFT fluorescence sometimes observed
at long time periods may reflect conversion of the mature
SMA amyloid fibrils to an alternative fibrillar form that has
a lower affinity toward TFT. For example, we frequently
observed lateral aggregation of mature fibrils along with
the decrease in TFT fluorescence, suggesting a possible
decrease in the availability of TFT binding sites.

We confirmed that the enhanced fluorescence emission
of TFT was indeed due to interaction with SMA amyloid 
fibrils by the characteristic Congo red green-birefringence 
observed under crossed-polarization (data not shown) and 
by direct observation with transmission electron microscopy 
and atomic force microscopy. These techniques demonstrated 
the presence of fibrils in systems with high TFT fluorescence, 
and their absence in samples with no increase in TFT 
fluorescence. The TFT assay was shown to be linear over
the concentration range of 0 to >12 µg of amyloidogenic
protein in the assay.

The initial lag in fibril formation (see Figure 1) is
attributed to the slow assembly of a critical nucleus in a
nucleation-polymerization mechanism (21). The length of
the lag (measured by extrapolating the exponential growth
phase to zero intensity) during SMA fibril formation was a
sensitive function of the pH at which the fibrils were
generated. Some typical data are shown in Figure 1. The
length of the lag decreased from days at pH 7.0, 37 °C, for
40 µM SMA, to a few hours at pH 2 (these values for the
lag time are very sensitive to the rate of stirring or agitation).
In addition, the maximum signal obtained using the TFT
assay increased with decreasing pH, indicating that more
fibrils were formed at lower pH values (note that the TFT
binding assay is performed at pH 7).
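One way to implement the lag-time measurement described above is to fit a straight line to the growth phase and take its intercept with zero intensity. The sketch below is an assumption-laden illustration rather than the authors' procedure: the growth phase is defined, arbitrarily, as the points lying between 20% and 80% of the maximum signal, the baseline is taken as zero, and the test curve is a synthetic logistic function with a 10 h midpoint.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def lag_time(times, signal, low=0.2, high=0.8):
    """Extrapolate the growth phase to zero intensity and return that time."""
    plateau = max(signal)
    growth = [
        (t, s) for t, s in zip(times, signal)
        if low * plateau <= s <= high * plateau
    ]
    if len(growth) < 2:
        raise ValueError("not enough points in the growth phase")
    intercept, slope = linear_fit([t for t, _ in growth], [s for _, s in growth])
    return -intercept / slope


# Synthetic ThT-like trace: time in hours, arbitrary fluorescence units.
times = [0.25 * i for i in range(81)]
signal = [100.0 / (1.0 + math.exp(-(t - 10.0))) for t in times]
lag = lag_time(times, signal)
print(round(lag, 2))
```

Because a late drop in fluorescence would lower the apparent plateau only if it exceeded the maximum, using the maximum signal as the plateau estimate is tolerant of the slow decline noted above, but noisy traces would need smoothing first.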

Static light scattering was used to monitor the kinetics
of aggregation (Figure 1). Surprisingly, we noted that
amorphous aggregation occurred in the same incubation
samples of SMA as fibril formation, but at a faster rate.
Substantial amorphous aggregation was observed from pH
7 to 4. The amount of amorphous aggregation was propor-
tional to the protein concentration. The amorphous aggrega-
tion was observed immediately after starting the stirring at
37 °C as indicated by increased light scattering (Figure
1B,C), whereas the presence of fibrils, as reflected by the
increase in TFT fluorescence, was not observed for several
days. Confirmation of the fact that this early aggregation
was indeed amorphous comes from TEM and AFM micro-
graphs that showed amorphous material and the absence of
fibrils (Figure 2). Under certain conditions, e.g., pH 5.0, 37
°C, 100 mM NaCl, 50-60% of the SMA had precipitated
as amorphous aggregates after 24 h of incubation (based
on the absorbance of the supernatant), and no fibrils were
visible by microscopy. In contrast, at pH <3 essentially all
of the precipitate was fibrillar within 24 h. The rate of the
increase in light scattering observed at pH 2 correlated
with that of the increase in Thioflavin T fluorescence (Figure
1A), suggesting that the predominant species present in
solution was fibrillar. This was confirmed by TEM, which
indicated that the aggregates were largely fibrillar, though
some amorphous material was present. Interestingly, the

Figure 1: pH dependence of amyloid fibril formation by recom-
binant V_L domain SMA. Fibril formation was monitored using
Thioflavin T emission at 482 nm upon excitation at 450 nm at pH
2 (A, △), 5 (B, □), and 7 (C, ○). Rayleigh light scattering was also
monitored at pH 2 (A, ▲), pH 5 (B, ■), and pH 7 (C, •). The
formation of amorphous aggregates precedes the formation of 
amyloid fibrils at pH 7 and 5, conditions that favor the native or 
the nativelike intermediate conformation. The inset to panel A 
shows an expanded time scale. Light scattering is sensitive to the 
presence of both amorphous and fibrillar material, whereas TFT 
fluorescence is selective for fibrillar aggregates alone. Panel D 
shows that in the absence of agitation, at SMA concentrations as 
high as 0.5 mM, no aggregation occurs over at least a 6 h period: 
circles are for pH 7, triangles for pH 2. With agitation, the signal 
would be > 1400 due to the aggregation. 

maximum increase in Thioflavin T fluorescence was sig-
nificantly less at pH 7 compared to pH 5 and pH 2, which
also correlated with fewer fibrils observed by microscopy
at pH 7 compared to the lower pH conditions.

Figure 2: Atomic force microscopy images of SMA fibrils
observed at pH 2 after 24 h of stirring at 37 °C (A, top). At pH 5 
and 7 after incubation for 1 h at 37 °C with constant stirring, a 
nebulous and loosely packed amorphous deposit is observed (B, 
bottom). After a few days at pH 5 and 7, both fibrils and amorphous 
deposits are observed. 

Morphology of Fibrillar and Amorphous Deposits. The 
SMA aggregates were examined using atomic force (Figure 
2) and transmission electron microscopy. The images re- 
vealed unbranched, rope-like fibrils (Figure 2A), several 
hundred nanometers in length, most with diameters of ~8 
nm, but some, protofibrils, with diameters of ~4 nm (7). 
Upon incubation at 37 °C at pH 4-6, amorphous aggregates 
were observed (Figure 2B). 

Spectroscopic Characterization of Acidic pH Conforma-
tions of SMA. A number of spectroscopic probes were utilized
to examine conformational changes in SMA that occur under 
conditions favoring the formation of amorphous aggregates 
(pH 4-6) and amyloid fibrils (pH <3). Both amyloid fibrils 
and amorphous aggregates are only formed in solutions of 
SMA at 37 °C upon agitation of the solution. Note that all 
the spectroscopic analyses were done at 37 °C without 
agitation within 2—3 h of preparation, ensuring that only 
soluble equilibrium conformations were studied; i.e., neither 
amorphous nor fibrillar aggregates were present in the 
spectroscopic analyses. No light scattering was observed for 
at least 6 h for solutions of SMA at concentrations as high 
as 7 mg/mL (used in the NMR experiments) at various pH 




Figure 3: Intrinsic tryptophan emission spectra were measured
with excitation at 280 nm for 0.5 µM protein solution in 50 mM
buffer and 100 mM NaCl at 37 °C for native SMA at pH 7.5 (♦),
denatured SMA in 8 M urea at pH 7.5 (○), and I_U at pH 2 (△).
The wavelength of maximum emission of tryptophan fluorescence 
(panel B) and emission intensity (panel C) are plotted against pH. 
The solid lines are fits to a single ionizable group pH transition 
using eq 1 (see Materials and Methods). The midpoint of pH 
transitions for emission maximum and the intensity at 345 nm were 
5.6 and 3.5, respectively. 



values from 2 to 7 (Figure 1D). Tertiary structure changes
were monitored by tryptophan fluorescence emission, near- 
UV CD, far-UV CD (via the 230 nm peak resulting from 
aromatic clustering), and by ANS binding studies. Secondary 
structure changes were monitored by far-UV CD and Fourier 
transform infrared spectroscopy. 

The two tryptophan residues of SMA, W35 and W50,
provided convenient spectroscopic means for assessing the
protein's conformational state. In particular, W35, which was
quenched by the close proximity of the core disulfide formed
from C23 and C88 in the native state, was observed to
provide a probe of the global conformational state of the V_L
domain. Unfolding of the protein resulted in a decrease of 
the quenching and a consequent increase in the emission 
intensity of W35 (Figure 3A, and ref SI). The second 
tryptophan residue, W50, was solvent-exposed in the native
state, based on both the crystal structure of LEN and the
observed λ_max of 347 nm in native SMA and LEN.

The intrinsic tryptophan fluorescence spectra indicated λ_max
of 347 and 355 nm for native and denatured SMA (8 M
urea), respectively (Figure 3A). In addition, denatured SMA
showed a large increase in Trp fluorescence intensity
compared to that of its native conformation. When the pH
of a solution of SMA was lowered from 7.5, significant
changes were observed in the tryptophan fluorescence
emission properties. In the pH 4-6 region, there was a
decrease in λ_max from 347 to 345 nm (Figure 3B), which
was attributed to the formation of a partially folded intermedi-
ate. For reasons to be discussed below, this intermediate was
called I_N (for nativelike intermediate). These data were fit
to eq 1, and the midpoint of this pH transition was calculated
to be 5.6. At pH <4, the emission intensity increases but
the λ_max does not change further (Figure 3B,C). The increase
in emission intensity observed from pH 5 to 2 suggests
significant disassembly of the hydrophobic core of the
protein. The midpoint of this transition was pH 3.4 and
appeared to be cooperative (Figure 3C). The fluorescence
spectrum of SMA at pH 2 had a fluorescence intensity
between those of the native and urea-denatured states, with a
shift in the maximum emission to 345 nm. This suggested
substantial, but incomplete, unfolding at pH 2. This
conformation of SMA, populated below pH 3, was called
I_U. In addition, the fluorescence spectrum of SMA was
measured as a function of protein concentration, over the
0.05-0.5 mg/mL range, to confirm the presence of the two
intermediates and to demonstrate the lack of association of
the samples under the experimental conditions.

The near-UV CD spectrum of SMA contained significant
contributions from the aromatic (tryptophan, tyrosine, and
phenylalanine) residues that were sensitive to the tertiary
structure of the protein. The near-UV CD spectrum of native
SMA showed distinct peaks at 286 and 296 nm, most
likely reflecting contributions from aromatic
clusters involving residues Y36, Y86, Y87, F98 and
Y27(d), Y32, Y49, Y91, Y92, observed in the crystal
structure of LEN, which would also be expected to be present
in SMA. The peaks at 286 and 296 nm disappeared with
transition midpoints of pH 3.2 and 3.4, respectively (Figure
4B). These transitions correspond to the formation of the I_U
species, suggesting loss of the nativelike environment of the
aromatic groups in this intermediate. The small positive
ellipticity for SMA at 268 nm showed transitions with
midpoints of 4.9 and 3.7, corresponding to the transitions to
I_N and I_U, respectively (Figure 4B). At pH 2, the near-UV
CD spectrum of SMA was essentially featureless (Figure
4A), suggesting loss of most of the tertiary structure in I_U,
including the aromatic clusters. As a whole, the near-UV
CD spectra for SMA in the pH 4-6 region indicated that
the underlying tertiary structure is still relatively nativelike
in this pH range, consistent with the presence of a nativelike
conformation in I_N.

The hydrophobic dye ANS has frequently been used as a 
probe to reveal the presence of partially folded intermediates 
due to the presence of increased exposure of contiguous 
hydrophobic surface area (22—24). ANS did not significantly 
bind to SMA in its native state, indicating the absence of 
exposed hydrophobic pockets. However, a pH titration of
SMA in the presence of ANS at 37 °C revealed a marked

Figure 4: Near-UV CD spectra for SMA. Panel A shows plots of
molar ellipticity for pH 7.5 (•), pH 5 (□), and pH 2 (△). Panel B
shows the pH dependence of the molar ellipticity at 296 nm (▽),
287 nm (○), and 268 nm (□). The lines through the data are fits to
eq 1 for peaks at 296 and 287 nm, yielding apparent pK_a values of 3.2
and 3.4, respectively, which correspond to formation of I_U. The
data at 268 nm are fitted to eq 2, resulting in two apparent pK_a
values of 4.9 and 3.7. Protein concentration was 0.7 mg/mL.

increase in the fluorescence emission and a blue shift in the
ANS emission maximum from 510 to 480 nm. Both the
emission intensity increase (data not shown) and the blue
shift of the emission were indicative of exposed hydrophobic
regions in the vicinity of pH 4-6, with a maximum at pH
4.5 (Figure 5). Such an observation was taken to indicate
the build-up of a partially folded intermediate, I_N. The
midpoints of the transitions observed for SMA were at pH
5.2 and 3.8 (Figure 5). The limited ANS binding at low pH
was attributed to the second intermediate, I_U, which appears
to have less contiguous exposed hydrophobic regions. The
data suggest that the I_N intermediate is maximally populated
between pH 4 and 5 for SMA. In contrast, no ANS binding
was observed in the pH 2-10 range for the nonamy-
loidogenic homologue LEN, indicating the absence of both
intermediates with LEN (Figure 5). In addition, the structure
of LEN at pH 2 is nativelike, based on small-angle X-ray


Partially Folded Intermediates and Light Chain Amyloidosis 



Biochemistry, Vol. 40, No. 12, 2001 3531




Figure 5: pH dependence of binding of ANS to SMA (•) and
LEN (○). The reaction was monitored with 0.5 µM protein solution
and 10 µM ANS, with excitation at 380 nm. Fluorescence emission
spectra were collected between 400 and 600 nm at different pHs. 
The solid line is a fit to two transitions using eq 2 (see Materials 
and Methods). The midpoints of the pH transitions were 5.2 and 
3.8, respectively. 



scattering, circular dichroism, FTIR, and Trp fluorescence
(unpublished observations).

Circular dichroism spectra were collected for SMA from
pH 8 to 2 to probe global secondary structural changes in
the different conformers. The far-UV circular dichroism
spectrum of native SMA at pH 7.5 was rather unusual, in
that it had distinct minima at 230 nm, as well as at 216 nm
(Figure 6A). The former was attributed to contributions from
aromatic interactions and possibly the disulfide, the latter to
β-structure. When the CD spectrum of SMA was examined
as a function of pH at 37 °C, there were relatively small
changes between pH 7.5 and pH 4, with more significant
changes occurring at lower pH (Figure 6). The former are
consistent with loss of some tertiary structure in I_N. The
spectrum at pH 2 was significantly different from the
spectrum of SMA denatured in 7 M urea or 5 M Gdn-HCl
at pH 4, indicating significant structure at pH 2. Plots of
the ellipticity at 230 nm against pH reveal the population of
a distinct conformational species in the pH 4-6 region
(Figure 6B). The ellipticity of SMA monitored at 216 nm
(corresponding to β-sheet structure) indicated no change
between pH 7.5 and 4.0, suggesting that I_N is relatively
nativelike. However, below pH 4.0, the negative ellipticity
at 216 nm shifted toward lower wavelengths, consistent with
the loss of β-sheet structure (Figure 6C). The spectrum at
pH 2.0 was a mixture of contributions from β-sheet and
random coil conformations, indicating some loss of nativelike
β-structure at pH 2. The data were consistent with the
presence of a relatively unfolded-like intermediate, I_U, at pH
below 3.

Fourier transform infrared spectroscopy (FTIR) has been
used to probe protein structure, and the amide I band (1600-
1700 cm⁻¹) has been used to estimate protein secondary
structure content (25). The ATR-FTIR spectra of hydrated
thin films of SMA at pH 7.5 and 5 revealed that significant
secondary structural changes occur for the I_N intermediate
compared to the native state (Figure 7). The major differences
are an increase in low-frequency β-sheet (1625 cm⁻¹), an
increase in disordered structure (1648 cm⁻¹), increased turn




Figure 6: Far-UV CD spectra of SMA as a function of pH. Panel
A shows the spectra at pH 7.5 (•), pH 5 (□), and pH 2 (△). The
changes in molar ellipticity at 230 nm and at 216 nm are plotted
against pH in panels B and C, respectively. The solid line in panel
B is a fit to eq 2 and gives apparent pK_a values of 6.3 and 3.7. The solid
line in panel C is a fit to eq 1 and gives an apparent pK_a of 3.5.

(1672 cm⁻¹), and a small decrease in the 1695 cm⁻¹
β-component.

At pH 2.0, the amide I spectrum of I_U is different from
that at pH 7.5 or 5.0, indicating that I_U has a different
secondary structure from native and I_N. The major changes
are a large increase in the looplike structure at 1660 cm⁻¹
and loss of the major β-peak at 1638 cm⁻¹ in the native
spectrum, which is replaced by a new, lower intensity
β-component at 1631 cm⁻¹.

¹H NMR spectra were collected to further assess the
conformational changes that took place in SMA at intermedi- 
ate and low pH. As shown in Figure 8A, the NMR spectrum 
of SMA at pH 7.0 is characteristic of a tightly folded protein, 
having well-dispersed amide, aromatic, and aliphatic proton 
resonances. As the pH was reduced to below pH 5 (Figure 
8B), only minor changes in the spectra were observed. These 
changes included both sharpening of many resonance lines 
as well as changes in some amide proton chemical shifts. 
However, the spectrum was still characteristic of a well-

Figure 7: Amide I region of the FTIR spectrum of SMA. Panel A
shows the spectrum for native SMA at pH 7.5. Panel B shows the
spectrum of the nativelike intermediate (I_N) at pH 5.0, and panel C
shows the spectrum of the unfolded-like intermediate (I_U) at pH
2.0. The raw ATR-FTIR spectra after water vapor subtraction are 
shown as thick lines. The thin lines in each panel are the component 
spectra obtained after curve-fitting the raw spectra. 

folded protein. When the pH was adjusted to below 3,
however, the chemical shift dispersion was lost, with the
amide proton resonances collapsing to an envelope less than
1.5 ppm wide (Figure 8C). The upfield methyl resonances
were also lost below pH 3, with both the aromatic and
aliphatic proton regions of the spectrum showing consider-
able loss of dispersion. Notably, no resonances corresponding
to either of the tryptophan indole moieties were apparent in
the low-pH spectra. Additionally, the spectrum at pH 2.0
was considerably different from that recorded for SMA under
strongly denaturing conditions (pH 2, 8 M urea) as shown
in Figure 8D. Under these strongly denaturing conditions, a
further loss of dispersion occurs throughout the spectrum
and significant changes in the chemical shifts of nearly all
of the amide protons occur. As the pH is lowered,
the NMR spectra show an increase in signal-to-noise for the
same concentration of protein. This is likely due to dissocia-
tion of the V_L dimer (K_d = 40 µM at pH 7) at lower pH.
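To see why dimer dissociation matters at NMR concentrations, the monomer fraction implied by a 40 µM dimer dissociation constant can be estimated from a simple monomer-dimer equilibrium. This is a sketch under stated assumptions: K_d is taken as [M]²/[D], total protein is expressed in monomer units, and the function names are hypothetical. The concentrations in the loop (0.5 µM, 40 µM, 500 µM) correspond to the fluorescence, fibril growth, and NMR samples described in the Methods.

```python
import math


def monomer_concentration(total, k_d):
    """Free monomer for 2M <-> D, with K_d = [M]^2/[D] and total = [M] + 2[D]."""
    return k_d * (math.sqrt(1.0 + 8.0 * total / k_d) - 1.0) / 4.0


def monomer_fraction(total, k_d):
    """Fraction of chains present as free monomer."""
    return monomer_concentration(total, k_d) / total


# Concentrations in micromolar; K_d of 40 uM as quoted for pH 7.
for total_um in (0.5, 40.0, 500.0):
    print(total_um, round(monomer_fraction(total_um, 40.0), 3))
```

Under these assumptions most of the protein is dimeric at 0.5 mM, so any shift toward monomer at lower pH would be expected to sharpen the resonances.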

Small-Angle X-ray Scattering Characterization of SMA
Conformations. Small-angle X-ray scattering measurements
indicated that SMA became less compact as the pH was 



Figure 8: ¹H NMR spectra of SMA. Panel A shows the spectrum
of the native protein at pH 7. Panel B shows the spectrum of the
nativelike intermediate, I_N, at pH 5, and panel C the spectrum of
the relatively unfolded intermediate, I_U, at pH 2. For comparison,
the spectrum of the unfolded protein in 8 M urea, pH 2, is shown
in panel D.

reduced from 7 to 2 (at a protein concentration of 80 µM),
with an increase in R_g from 19.6 ± 0.4 Å at pH 7 to 26.8 ±
0.6 Å at pH 2. At pH 5, the protein was only slightly
expanded, having an R_g of 20.5 ± 0.6 Å. Kratky plots of the
scattering data indicated that extensive globularity was
maintained even at low pH, although some denaturation was
apparent at pH 5 and more so at pH 2. The significant
compactness of I_U (R_g = 26.8 Å) is apparent by comparison
of its R_g to that of the fully unfolded protein (R_g > 30 Å).

The stability of SMA was measured at different pHs using urea
denaturation monitored by intrinsic tryptophan fluorescence
and far-UV CD. Tryptophan fluorescence had the advantage
that it permitted the use of low protein concentrations, which
eliminated potential aggregation problems during unfolding.
At pH 7.5, SMA is only marginally stable, with a free energy
of unfolding ΔG° = 4.8 kcal/mol and m = 1.05 kcal/mol.
Similar equilibrium plots were obtained when starting with
either native or unfolded protein, indicating that there was
no hysteresis in the urea unfolding transitions. As the pH
was decreased, the stability of SMA decreased significantly
(Table 1) as indicated by the decrease in the midpoint of

Table 1: Effect of pH on the Stability of SMA^a

pH      C_m (M)
2       ~0
        1.5 ± 1.5
        3.2 ± 0.1
8       4.0 ± 0.1
10      3.5 ± 0.1

^a Midpoints of urea unfolding transitions were monitored by Trp
fluorescence.

the urea unfolding transition with decrease in pH. In light 
of the evidence for formation of a non-native species from 
SMA in the vicinity of pH 4-6 (I N ), the urea unfolding data 
were not converted into free energy data except for neutral 
pH. 

Kinetics of Interconversion of Intermediates. The rates of
interconversion between the native conformation (N) and the
partially folded intermediates (I_N and I_U) were monitored by
pH jumps. Interconversions between N and I_N were moni-
tored using changes in the ANS fluorescence and jumping
the pH of solutions of SMA from 7 to 5 (N to I_N) and from
5 to 7 (I_N to N). Interconversions involving I_U were monitored
by observing changes in the tryptophan fluorescence, with
pH jumps of 7 to 2 (N to I_U), 5 to 2 (I_N to I_U), 2 to 5 (I_U to
I_N), and 2 to 7 (I_U to N). The results from these experiments
show that conversion of N to I_N and I_N to N are fast
processes, complete within the dead time of manual mixing,
indicating that the rate constant is >0.35 s⁻¹. On the other
hand, conversion of either N or I_N to I_U and the reverse are
slower processes. The rates for both N to I_U and I_N to I_U
were the same within experimental error, namely, 0.01 ±
0.002 s⁻¹, consistent with I_N lying on the pathway between
N and I_U. The rates of the transformation of I_U to either I_N
or N were the same, with a rate constant of 0.002 ± 0.0008
s⁻¹, again consistent with I_N lying on the pathway between
N and I_U.
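The time scales implied by these rate constants are easier to compare as half-times. Assuming simple first-order kinetics, which the text does not state explicitly, t½ = ln 2 / k. The sketch below applies this to the lower bound of 0.35 s⁻¹ for the N and I_N interconversion and to the 0.002 s⁻¹ rate for the return from I_U. The function names are illustrative.

```python
import math


def half_time(rate_constant):
    """Half-time in seconds for a first-order process with rate k (s^-1)."""
    return math.log(2.0) / rate_constant


def fraction_remaining(rate_constant, seconds):
    """Fraction of the starting species left after a given time."""
    return math.exp(-rate_constant * seconds)


# Rate constants quoted in the text, in s^-1.
for label, k in (
    ("N <-> I_N (lower bound on k)", 0.35),
    ("I_U -> I_N or N", 0.002),
):
    print(label, round(half_time(k), 1), "s")
```

On this reading, the fast step is complete within about 2 s of mixing, whereas refolding from I_U has a half-time of several minutes.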

DISCUSSION 

There is increasing evidence to suggest that protein
aggregation, including amyloid fibril formation, arises from
a partially folded conformation of the aggregating protein
(5, 6, 26-30). The present data strongly support such a
mechanism for the aggregation of immunoglobulin light chain
variable domains. Protein aggregation has generally been
regarded as being driven by nonspecific hydrophobic interac-
tions operating on unfolded or collapsed molten-globule
states. On the other hand, extracellular aggregation of some
proteins to form amyloid fibrils has been conventionally
attributed to mutations altering the local surface properties
of the native state, thereby introducing new packing interac-
tions for oligomerization of the native state (26, 31). In the
case of SMA, our results clearly indicate that a nativelike
conformation in the fibrils is highly unlikely. Thus, the model
for light chain fibrils proposed by Stevens and co-
workers (31), and based on the native structure, is incon-
sistent with the experimental observations.

The partially folded conformations that are the critical 
precursors to protein aggregation may arise either during the 
folding of newly synthesized proteins, as with inclusion 
bodies, or from the native state, as appears plausible for the 
extracellular amyloid deposits. The build-up of the soluble 



precursor that triggers aggregation could involve a combina-
tion of factors including an amino acid sequence leading to 
a relatively less stable native state as compared to that of a 
nonaggregating variant and/or the presence of mildly desta- 
bilizing extrinsic conditions. 

Detailed analysis of the properties of the imj loid menic 
• e tding it ibil iditions cess ' . | it 
partially folded intermediate conformations, propensity to 

»re a- md inetics of iggtegation and fibril formation, 
provides insight into the molecular basis for aggregation. The 
present results raise a number of interesting questions, such 
as: Which features are responsible for the propensity to form 
fibrils? Why are both amorphous and fibrillar aggregates 
formed, and what is the relationship, if any, between them? 
How do the two partially folded intermediate conformations, 
I_N and I_U, fit into the picture? We will begin with this last 
question. 

Destabilizing Conditions Lead to Partially Folded Intermediate 
Conformations. Mildly destabilizing conditions, such 
as low pH or low urea concentrations (data not shown), lead 
to enhanced aggregation and fibrillation of SMA. The results 
of the spectroscopic investigations of SMA as a function of 
pH reveal that SMA forms two partially folded conformations, 
I_N and I_U, the former being relatively nativelike in its 
structural properties, whereas the latter is considerably more 
unfolded. The major significance of the observation of these 
species is the correlation between formation of amorphous 
aggregates and I_N, and formation of fibrils and I_U. Both I_N 
and I_U are envisaged as ensembles of conformations that 
retain some nativelike structure (more in I_N and less in I_U) 
with the remainder of the chain, especially for I_U, being 
highly mobile and disordered but biased toward its native 
conformation. 

The near- and far-UV circular dichroism, Trp fluorescence, 
NMR, and SAXS data all point to I_N as being a relatively 
nativelike species with most structural properties similar to 
those of the native state. The significant increase in ANS 
binding, however, points to the critical feature of this 
intermediate, namely, increased exposure of hydrophobic 
surfaces compared to the native state. The increased negative 
ellipticity in the far-UV CD at 230 nm is related to this, as 
it probably represents minor structural rearrangements in 
side-chain packing manifested as changes in the CD contribution 
of aromatic residues, rather than secondary structure 
changes. The enhanced ANS binding in the pH 4-6 range 
is consistent with the population of a partially folded 
intermediate (22, 23). Likewise, the FTIR spectrum for I_N 
reveals that although the overall secondary structure is quite 
similar to that of the native state, there are nevertheless 
significant structural differences. These include increased turn 
structure and a shift in some of the β-structure to components 
with a lower frequency band, perhaps signifying changes in 
the β-strand interactions. From examination of the spectral 
probes as a function of pH, it is apparent that there is a 
structural transition with a midpoint around pH 5.5. This 
transition, which corresponds to the interconversion of the 
native state to I_N, could reflect the titration of histidine or 
carboxylate residues in SMA. The pH-dependent transition 
from N to I_N was not observed for the nonamyloidogenic 
LEN (Figure 5 and unpublished observations). The only 
differences in ionizable side chains between LEN and SMA 
are two histidines present in SMA (8); this suggests that one 
or both of these His residues, either directly or indirectly, 
may be responsible for the transitions between N and I_N. 

The data indicate that SMA forms a second, more 
unfolded, intermediate, I_U, at pH <3. From comparison of 
the spectral probe signals as a function of pH, a common 
transition, attributed to that of I_N to I_U, with a midpoint in 
the vicinity of pH 3.3 is observed with all the probes. This 
intermediate retains substantial compactness and secondary 
structure, consistent with the presence of a partially folded 
intermediate conformation. Based on the apparent pK of the 
transition between I_N and I_U, this conformational change is 
apparently governed by the titration of carboxylate groups. 
The fact that at pH 2 the fluorescence signal reflects 
substantial residual structure (significantly decreased emission 
and blue-shifted λmax relative to the unfolded and native 
states) suggests that there is also tertiary structure present 
in I_U. However, the near-UV CD spectrum indicates the loss 
of most of the native aromatic side-chain interactions, 
suggesting that the aromatic clusters present in the native 
conformation may no longer be present. The FTIR spectra 
indicate additional loop/disordered structure in I_U compared 
to the native and I_N conformations, and also loss of nativelike 
β-structure. The NMR spectrum of this intermediate also 
clearly indicates that it retains considerable secondary and 
tertiary structure. 
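
The two pH midpoints reported above (about 5.5 for N to I_N and about 3.3 
for I_N to I_U) can be turned into a rough population estimate. The sketch 
below is only an illustration, not the authors' analysis: it assumes a 
sequential N <-> I_N <-> I_U model, ignores the fully unfolded state, and 
assumes each transition behaves like a single-proton titration (the 
n_protons parameter is hypothetical; the source does not report it). 

```python
def populations(ph, pk_n_to_in=5.5, pk_in_to_iu=3.3, n_protons=1.0):
    """Fractions of N, I_N and I_U at a given pH.

    Assumes a sequential N <-> I_N <-> I_U model in which each step
    titrates with the stated midpoint. n_protons is an assumed
    cooperativity, not a value taken from the paper.
    """
    # Ratio [I_N]/[N]; equals 1 at the first midpoint.
    r1 = 10.0 ** (n_protons * (pk_n_to_in - ph))
    # Ratio [I_U]/[I_N]; equals 1 at the second midpoint.
    r2 = 10.0 ** (n_protons * (pk_in_to_iu - ph))
    total = 1.0 + r1 + r1 * r2
    return {"N": 1.0 / total, "I_N": r1 / total, "I_U": r1 * r2 / total}


if __name__ == "__main__":
    for ph in (7.0, 5.5, 4.5, 3.3, 2.0):
        f = populations(ph)
        print(f"pH {ph:.1f}: N={f['N']:.3f}  I_N={f['I_N']:.3f}  I_U={f['I_U']:.3f}")
```

Under these assumptions I_N is the dominant species between the two 
midpoints and I_U dominates below pH 3, which is the pattern the 
spectroscopic probes are described as showing. 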

Among the factors that stimulate fibril formation from 
SMA are increased protein concentration, agitation, and 
increased temperature. All these are likely to result in 
increased concentration of non-native conformations (through 
equilibrium between native and non-native states, denaturation 
at air-water interfaces, and shifting of the equilibrium 
from native to non-native conformations, respectively), with 
their known propensity to aggregate. These observations 
strengthen the correlation between fibril formation and the 
presence of partially folded intermediates. 

A minor complication in some of these experiments arises 
from the fact that V_L domains are known to dimerize in a 
fashion similar to the association of the V_L-V_H domain 
interaction in intact immunoglobulins. Stevens and Schiffer 
(52) demonstrated that native SMA exists in a monomer-dimer 
equilibrium with K_d = 40 μM under physiological 
conditions. Thus, data collected with high concentrations of 
SMA, such as SAXS and NMR, will be potentially complicated 
by the presence of these native dimers, at least at 
neutral pH. The main anticipated effect would be that under 
such conditions the equilibrium between the native conformation 
and the intermediates would be shifted in favor of 
the native conformation, rather than a non-native one. This 
leads to a potential decrease in the rate of fibril formation 
under conditions where the protein is present mostly as the 
native dimer (unpublished observations). 
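
To give a feel for how much dimer is present at the concentrations used 
for SAXS or NMR, the following sketch solves the mass balance for a simple 
2M <-> D equilibrium. It assumes the 40 μM constant quoted above is a 
dissociation constant defined as K_d = [M]^2/[D], and the function names 
are hypothetical; it is an illustration rather than a reproduction of the 
cited measurement. 

```python
import math


def monomer_concentration(total_um, kd_um=40.0):
    """Free monomer concentration (uM) for 2M <-> D with Kd = [M]^2 / [D].

    total_um is the total chain concentration, [M] + 2[D], in uM.
    """
    # Mass balance: total = [M] + 2[M]^2/Kd, a quadratic in [M]
    # whose positive root is given below.
    return (kd_um / 4.0) * (math.sqrt(1.0 + 8.0 * total_um / kd_um) - 1.0)


def dimer_fraction(total_um, kd_um=40.0):
    """Fraction of all chains that are incorporated into dimers."""
    if total_um <= 0.0:
        return 0.0
    return 1.0 - monomer_concentration(total_um, kd_um) / total_um


if __name__ == "__main__":
    for total in (4.0, 40.0, 400.0, 4000.0):
        print(f"{total:7.1f} uM total: {dimer_fraction(total):.2f} of chains in dimers")
```

In this model half of the chains are dimeric when the total concentration 
equals K_d, and the dimer fraction keeps rising with concentration, which 
is why high-concentration measurements are the ones most affected. 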

Correlation between Stability and Aggregation. The stability 
of SMA is greatly decreased at acidic pH, correlating with 
the increased amorphous and fibrillar aggregation. The fact 
that at lower pH SMA readily forms fibrils suggests that 
removal of nativelike interactions is important prior to 
fibrillation. Our data show that the aggregation of the 
amyloidogenic SMA correlates well with the decreased 
stability of the native state and the population of non-native 
conformations. We believe that it is the differential 
destabilization of the native state of SMA relative to the partially 
folded intermediate conformations which is the key feature 
of the amyloidogenesis, and which will be attenuated in 
nonamyloidogenic light chains. These observations are in 
accord with previous investigations of the correlation between 
amyloidogenesis and stability in light chains, which 
have indicated that either destabilizing mutations or sequences 
(12, 33, 34), or destabilizing conditions (55) correlate with 
increased fibrillogenicity. In fact, SMA has been shown to 
be about 2.5 kcal/mol less stable than the benign LEN. 
Further, it has been shown that destabilization of 
LEN with Gdn-HCl leads to fibrillation (12). 

Scheme 1 

N ⇌ I_N ⇌ I_U ⇌ U 

Scheme 2 

[scheme not recoverable] 
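
The stability difference of about 2.5 kcal/mol between SMA and LEN quoted 
above can be related to equilibrium populations through the Boltzmann 
factor. The sketch below is a back-of-the-envelope illustration only: it 
assumes the entire difference destabilizes the native state while leaving 
the free energy of the non-native state unchanged, which the source does 
not state, and the function name is hypothetical. 

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def population_ratio(ddg_kcal_per_mol, temp_k=298.15):
    """Fold increase in K = [non-native]/[native] for a native state
    destabilized by ddg_kcal_per_mol.

    Assumes the non-native state's free energy is unaffected, so the
    whole difference appears in the equilibrium constant.
    """
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temp_k))


if __name__ == "__main__":
    for temp in (298.15, 310.15):
        ratio = population_ratio(2.5, temp)
        print(f"T = {temp:.2f} K: equilibrium constant larger by a factor of {ratio:.0f}")
```

Under this assumption a 2.5 kcal/mol difference corresponds to an 
equilibrium constant that is larger by well over an order of magnitude at 
room temperature, so even a modest destabilization can noticeably raise 
the concentration of aggregation-prone species. 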

Amorphous and Fibrillar Aggregates. The observation that 
both fibril and amorphous aggregate formation occur simultaneously 
under many conditions raises a number of questions, 
for example: Do they arise from similar or different 
partially folded conformations, and do amorphous aggregates 
convert directly or indirectly to fibrils? 

The coincident kinetics for light scattering and TFT at pH 
<3, in conjunction with the limited amount of amorphous 
aggregates observed by EM, indicate that at these low pHs 
fibrils are preferentially formed, and that the limited amount 
of amorphous aggregates formed under these conditions 
rapidly converts to fibrillar species. Since the spectroscopic 
results suggest that at pH <3 the only significant species 
present is I_U, the limited amount of amorphous aggregates 
indicates that it is I_U that is responsible for fibril formation. 
Similarly, the fact that the pH at which maximal amounts 
of amorphous material are found is in the vicinity of 5.5 
indicates that I_N is primarily responsible for the amorphous 
aggregates. The decreased amplitude of the final TFT signal 
at higher values of pH is attributed to the increased amount 
of amorphous aggregate at higher pHs. 

The kinetics of interconversion between N, I_N, and I_U are 
consistent with I_N being on the pathway between N and I_U 
(Scheme 1). However, the possibility that there are separate 
pathways to I_N and I_U (Scheme 2) cannot be eliminated at 
this time. The fast interchange between N and I_N is not 
surprising, given the fact that I_N is relatively nativelike. 

The decreased amounts of fibrils observed at higher pH 
values presumably result from the smaller amounts of I_U in 
equilibrium with N and I_N (even though the equilibrium 
levels of I_U may be quite low at higher pHs, any I_U lost in 
fibril formation will be replaced by mass action with more 
soluble I_U). The strong correlation between the pH dependence 
of aggregation (and fibril formation) and the population 
of the partially folded intermediates supports the hypothesis 
that the observed intermediates are key players in the 
aggregation process. Aggregation, even under native 
conditions, can be attributed to the Boltzmann distribution 
of ensembles of various states under nativelike conditions. 
Hence, it is possible that the key intermediate, highly populated 
at pH ≤3, is also present under nativelike conditions, but at 
substantially lower concentration, and is responsible for 
amyloid formation after an extended lag period. 

Aggregation results from the strong self-association tendency 
of the partially folded intermediates, probably due to 
the presence of large solvent-exposed hydrophobic patches, 
which are absent in the native and fully unfolded states. The 
increased β-structure observed in the aggregated states 
reflects β-strand to β-strand interactions involved in the 
intermolecular association. Fibril formation is expected to 
involve a number of intermediate states of soluble oligomers 
of partially folded intermediates, potentially populated at very 
low levels. Aggregation occurs under conditions in which a 
suitably high concentration of the key partially folded 
intermediate is present, due to a combination of destabilizing 
factors, such as pH and temperature, or urea, and amino acid 
sequence, as well as the concentration of the intermediate. 
Thus, it is mostly the intrinsically low stability of SMA that 
leads to the build-up of the intermediate leading to aggregation, 
under conditions where more stable light chains form 
negligible intermediate and remain in the native conformation. 

The much more rapid formation of amorphous aggregates 
of SMA, compared to fibrils, is very interesting, and raises 
a number of questions regarding the nature of the relationship 
between the initially formed amorphous aggregates and the 
more slowly formed fibrils. Based on the data reported here, 
it is apparent that the partially folded intermediate populated 
in the pH 4-6 region is the direct precursor of the amorphous 
aggregates. The correlation between I_N and amorphous 
aggregates, and I_U and fibrils, suggests that the 
ratio of the two types of deposits is determined, at least in 
part, by kinetic competition between the pathways leading 
to the two different intermediates. A more detailed investigation 
of the relationship between amorphous and fibrillar 
deposits will be given elsewhere. 

The results of the present investigation firmly establish 
the existence of partially folded intermediates as key precursors 
in the aggregation of the amyloidogenic light 
chain variable domain SMA. In addition, the observation of 
two such intermediates is the first report that a given protein 
might have more than one critical intermediate conformation 
on the aggregation pathway, and that such different conformations 
may lead to different types of deposits. One 
implication of this is that factors, such as chaperones, which 
may change the effective concentration of one of the 
intermediates may change the nature of the deposits. 